Abstract

Testing restrictions on regression coefficients in linear models often requires correcting
the conventional F-test for potential heteroscedasticity or autocorrelation amongst the disturbances,
leading to so-called heteroskedasticity and autocorrelation robust test procedures.
These procedures have been developed with the purpose of attenuating size distortions and 
power deficiencies present for the uncorrected F-test. We develop a general theory to establish 
positive as well as negative finite-sample results concerning the size and power properties of a 
large class of heteroskedasticity and autocorrelation robust tests. Using these results we show 
that nonparametrically as well as parametrically corrected F-type tests in time series regression 
models with stationary disturbances have either size equal to one or nuisance-infimal power 
equal to zero under very weak assumptions on the covariance model and under generic conditions
on the design matrix. In addition we suggest an adjustment procedure based on artificial
regressors. This adjustment resolves the problem in many cases in that the so-adjusted tests
do not suffer from size distortions. At the same time their power function is bounded away 
from zero. As a second application we discuss the case of heteroscedastic disturbances. 
AMS Mathematics Subject Classification 2010: 62F03, 62J05, 62F35, 62M10, 62M15 
Keywords: Size distortion, power deficiency, invariance, robustness, autocorrelation, het- 
eroscedasticity, HAC, fixed-bandwidth, long-run-variance, feasible GLS 
1 Introduction


So-called autocorrelation robust tests have received considerable attention in the econometrics
literature in the last two and a half decades. These tests are Wald-type tests which make use of an
appropriate nonparametric variance estimator that tries to take into account the autocorrelation
in the data. The early papers on such nonparametric variance estimators in econometrics date
from the late 1980s and early 1990s (see, e.g., Newey and West (1987, 1994), Andrews (1991), and
Andrews and Monahan (1992)) and typically consider consistent variance estimators. The ideas


* Parts of the results in the paper have been presented as the Econometric Theory Lecture at the International 
Symposium on Econometric Theory and Applications, Shanghai, May 19-21, 2012. 
and techniques underlying this literature derive from the much earlier literature on spectral
estimation and can be traced back to work by Bartlett (1950), Jowett (1955), Hannan (1957), and
Grenander and Rosenblatt (1957), the latter explicitly discussing what would now be called
autocorrelation robust tests and confidence intervals (Section 7.9 of Grenander and Rosenblatt (1957)). For
book-length treatments of spectral estimation see the classics Hannan (1970) or Anderson (1971).
Autocorrelation robust tests for the location parameter also play an important role in the field of
simulation, see, e.g., Heidelberger and Welch (1981) or Flegal and Jones (2010). In a similar vein,
so-called heteroskedasticity robust variance estimators and associated tests have been invented by
Eicker (1963, 1967) and have later been introduced into the econometrics literature. As mentioned
before, the autocorrelation robust test statistics considered in the above cited econometrics
literature employ consistent variance estimators leading to an asymptotic chi-square distribution under
the null. It soon transpired from Monte Carlo studies that these tests (using as critical values the
quantiles of the asymptotic chi-square distribution) are often severely oversized in finite samples.
This has led to the proposal to use a test statistic of the same form, but to obtain the critical values
from another (nuisance parameter-free) distribution which arises as the limiting distribution in an
alternative asymptotic framework ("fixed bandwidth asymptotics") in which the variance estimator
is no longer consistent, see Kiefer et al. (2000), Kiefer and Vogelsang (2002a,b, 2005). The idea of
using "fixed bandwidth asymptotics" can be traced back to earlier work by Neave (1970). Monte
Carlo studies have shown that these tests typically are also oversized, albeit less so than the tests
mentioned earlier. This improvement, however, is often achieved at the expense of some loss of
power. In an attempt to better understand size and power properties of autocorrelation robust
tests, higher-order asymptotic properties of these tests have been studied (Velasco and Robinson
(2001), Jansson (2004), Sun et al. (2008, 2011), Zhang and Shao (2013)).

The first-order as well as the higher-order asymptotic results in the literature cited above are 
all pointwise asymptotic results in the sense that they are derived under the assumption of a fixed 
underlying data-generating process (DGP). Therefore, while these results tell us something about 
the limit of the rejection probability, or the rate of convergence to this limit, for a fixed underlying 
DGP, they do not necessarily inform us about the size of the test or its asymptotic behavior (e.g., 
limit of the size as sample size increases) nor about the power function or its asymptotic behavior. 
The reason is that the asymptotic results do not hold uniformly in the underlying DGP under the 
typical assumptions on the feasible set of DGPs in this literature. Of course, one could restrict 
the set of feasible DGPs in such a way that the asymptotic results hold uniformly, but this would 
require the imposition of unnatural and untenable assumptions on the set of feasible DGPs as will 
transpire from the subsequent discussion; cf. also Subsection 3.1.2.

In Section [3] of the present paper we provide a theoretical finite-sample analysis of the size and 
power properties of autocorrelation robust tests for linear restrictions on the parameters in a linear 
regression model with autocorrelated errors. Being finite-sample results, the findings of the paper 
apply equally well regardless of whether we fancy that the variance estimator being used would 
be consistent or not would sample size go to infinity. Under a mild assumption on the richness
of the set of allowed autocorrelation structures in the maintained model, the results in Section [3]
imply that in most cases the size of common autocorrelation robust tests is 1 or that the worst case
power is 0 (or both). The richness assumption just mentioned only amounts to requiring that all
correlation structures corresponding to stationary Gaussian autoregressive processes of order 1 are 
allowed for in the model. Compared to the much wider assumptions on the DGP appearing in the 
literature on autocorrelation robust tests cited above, this certainly is a very mild assumption. [Not
including all stationary Gaussian autoregressive models of order 1 into the set of feasible disturbance
processes appears to be an unnatural restriction in a theory of autocorrelation robust tests, cf. also
the discussion in Subsection 3.1.2.] A similar negative result is derived for tests that do not use a
nonparametric variance estimator but use a variance estimator derived from a parametric model
as well as for tests based on a feasible generalized least squares estimator (Subsection 3.2). We
also show that the just mentioned negative results hold generically in the sense that, given the
linear restrictions to be tested, the set of design matrices such that the negative results do not
apply is a negligible set (Propositions 3.5 and 3.15). Furthermore, we provide a positive result in
that we isolate conditions (on the design matrix and on the restrictions to be tested) such that the 
size of the test can be controlled. While this result is obtained under the strong assumption that 
the set of feasible correlation structures coincides with the correlation structures of all stationary 
autoregressive processes of order 1, it should be noted that the negative results equally well hold
under this parametric correlation model. The positive result just mentioned is then used to show 
how for the majority of testing problems autocorrelation robust tests can be adjusted in such a way 
that they do not suffer from the "size equals 1" and the "worst case power equals 0" problem. In 
Section [4] we provide an analogous negative result for heteroskedasticity robust tests and discuss
why a (nontrivial) positive result is not possible. 

We next discuss some related literature. Problems with tests and confidence sets for the intercept
in a linear regression model with autoregressive disturbances have been pointed out in Section 5.3
of Dufour (1997) (in a somewhat different setup). These results are specific to testing the intercept
and do not apply to other linear restrictions. This is, in particular, witnessed by our positive results
for certain testing problems. Furthermore, there is a considerable body of literature concerned
with the properties of the standard F-test (i.e., the F-test constructed without any correction
for autocorrelation) in the presence of autocorrelation, see the references cited in Krämer et al.
(1990) and Banerjee and Magnus (2000). Much of this literature concentrates on the case where
the errors follow a stationary autoregressive process of order 1. As the correlation in the errors is
not accounted for when considering the standard F-test, it is not too surprising that the standard
F-test typically shows deplorable performance for large values of the autocorrelation coefficient $\rho$,
see Krämer (1989), Krämer et al. (1990), Banerjee and Magnus (2000), and Subsection 3.3 for more
discussion. Section 3 of the present paper shows that autocorrelation robust tests, despite
having built into them a correction for autocorrelation, exhibit a similarly bad behavior. Finally, in
a different testing problem (the leading case being testing the correlation of the errors in a spatial
regression model) Martellosio (2010) has studied the power of a class of invariant tests including
standard tests like the Cliff-Ord test and observed somewhat similar results in that the power of
the tests considered typically approaches (as the strength of the correlation increases) either 0 or
1. While his results are similar in spirit to some of our results, his arguments are unfortunately
fraught with a host of problems. See Preinerstorfer and Pötscher (2013) for discussion, corrections,
and extensions.

The results in Section 3 for autocorrelation robust tests and in Section 4 for heteroskedasticity
robust tests are derived as special cases of a more general theory for size and power properties
of a larger class of tests that are invariant under a particular group of affine transformations.
This theory is provided in Section 5. One of the mechanisms behind the negative results in the
present paper is a concentration mechanism explained subsequent to Theorem 3.3 and in more
detail in Subsection 5.2, cf. also Corollary 5.17. A second mechanism generating negative results is
described in Theorem 5.19. The theory underlying the positive results mentioned above is provided
in Subsection 5.3 and in Theorem 5.21 as well as Proposition 5.23. Furthermore, the results in
Section 5 allow for covariance structures more general than the ones discussed in Sections 3 and 4.
For example, from the results in Section 5 results similar to the ones in Section 3 could be derived
for heteroskedasticity/autocorrelation robust tests of regression coefficients in spatial regression
models or in panel data models; for an overview of heteroskedasticity/autocorrelation robust tests
in these models see Kelejian and Prucha (2007, 2010) and Vogelsang (2012). We do not provide
any such results for lack of space. We note that for the uncorrected standard F-test in this setting
negative results have been derived in Krämer (2003) and Krämer and Hanck (2009).



2 The Hypothesis Testing Framework 

Consider the linear regression model 

$$Y = X\beta + U, \tag{1}$$

where $X$ is a (real) nonstochastic regressor (design) matrix of dimension $n \times k$ and $\beta \in \mathbb{R}^k$ denotes
the unknown regression parameter vector. We assume $\operatorname{rank}(X) = k$ and $1 \leq k < n$. The $n \times 1$
disturbance vector $U = (u_1, \ldots, u_n)'$ is normally distributed with mean zero and unknown
covariance matrix $\sigma^2\Sigma$, where $0 < \sigma^2 < \infty$ holds (and $\sigma$ always denotes the positive square
root). The matrix $\Sigma$ varies in a prescribed (nonempty) set $\mathfrak{C}$ of symmetric and positive definite
$n \times n$ matrices.^1 Throughout the paper we make the assumption that $\mathfrak{C}$ is such that $\sigma^2$ and $\Sigma \in \mathfrak{C}$
can be uniquely determined from $\sigma^2\Sigma$. [For example, if the first diagonal element of each $\Sigma \in \mathfrak{C}$
equals 1 this is satisfied; alternatively, if the largest diagonal element or the trace of each $\Sigma \in \mathfrak{C}$ is
normalized to a fixed constant, $\mathfrak{C}$ has this property.] Of course, this assumption entails little loss
of generality and can, if necessary, always be achieved by a suitable reparameterization of $\sigma^2\Sigma$.

The linear model described above induces a collection of distributions on $\mathbb{R}^n$, the sample space
of $Y$. Denoting a Gaussian probability measure with mean $\mu \in \mathbb{R}^n$ and covariance matrix $\sigma^2\Sigma$ by
$P_{\mu,\sigma^2\Sigma}$ and setting $\mathfrak{M} = \operatorname{span}(X)$, the induced collection of distributions is given by

$$\{P_{\mu,\sigma^2\Sigma} : \mu \in \mathfrak{M},\ 0 < \sigma^2 < \infty,\ \Sigma \in \mathfrak{C}\}. \tag{2}$$

Note that each $P_{\mu,\sigma^2\Sigma}$ is absolutely continuous with respect to (w.r.t.) Lebesgue measure on $\mathbb{R}^n$,
since every $\Sigma \in \mathfrak{C}$ is positive definite by assumption. We consider the problem of testing a linear
(better: affine) restriction on the parameter vector $\beta \in \mathbb{R}^k$, namely the problem of testing the null
$R\beta = r$ versus the alternative $R\beta \neq r$, where $R$ is a $q \times k$ matrix of rank $q$ and $r \in \mathbb{R}^q$. To be more
precise and to emphasize that the testing problem is in fact a compound one, the testing problem
needs to be written as

$$H_0 : R\beta = r,\ 0 < \sigma^2 < \infty,\ \Sigma \in \mathfrak{C} \quad \text{vs.} \quad H_1 : R\beta \neq r,\ 0 < \sigma^2 < \infty,\ \Sigma \in \mathfrak{C}. \tag{3}$$



This is important to stress because size and power properties of tests critically depend on nuisance 
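parameters and, in particular, on the complexity of $\mathfrak{C}$. Define the affine space

$$\mathfrak{M}_0 = \{\mu \in \mathfrak{M} : \mu = X\beta \text{ and } R\beta = r\}$$

and let

$$\mathfrak{M}_1 = \mathfrak{M} \setminus \mathfrak{M}_0 = \{\mu \in \mathfrak{M} : \mu = X\beta \text{ and } R\beta \neq r\}.$$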



1 Although not expressed in the notation, the elements of $Y$, $X$, and $U$ (and even the probability space supporting
$Y$ and $U$) may depend on sample size $n$. Furthermore, the obvious dependence of $\mathfrak{C}$ on $n$ will also not be shown in
the notation. [Note that $\mathfrak{C}$ depends on $n$ even if it is induced by a covariance model for the entire process $(u_t)_{t \in \mathbb{N}}$
that does not depend on $n$.]



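Adopting these definitions, the above hypothesis can also be written as

$$H_0 : \mu \in \mathfrak{M}_0,\ 0 < \sigma^2 < \infty,\ \Sigma \in \mathfrak{C} \quad \text{vs.} \quad H_1 : \mu \in \mathfrak{M}_1,\ 0 < \sigma^2 < \infty,\ \Sigma \in \mathfrak{C}. \tag{4}$$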

Two remarks are in order: First, the Gaussianity assumption is not really a restriction for the
negative results in the paper, since they hold a fortiori in any enlarged model that allows not only
for Gaussian but also for non-Gaussian disturbances. Furthermore, a large portion of the results in
the paper (positive or negative) continue to hold for certain classes of non-Gaussian distributions
such as, e.g., elliptical distributions, see Subsection 5.5. Second, if $X$ were allowed to be stochastic
but independent of $U$, the results of the paper apply to size and power conditional on $X$. Because $X$
is observable, one could then argue in the spirit of conditional inference (see, e.g., Robinson (1979))
that conditional size and power and not their unconditional counterparts are the more relevant
characteristics of a test.

Recall that a (randomized) test is a Borel-measurable function $\varphi$ from the sample space $\mathbb{R}^n$ to
$[0, 1]$. If $\varphi = \mathbf{1}_W$, the set $W$ is called the rejection region of the test. As usual, the size of a test
$\varphi$ is the supremum over all rejection probabilities under the null hypothesis $H_0$ and thus is given
by $\sup_{\mu \in \mathfrak{M}_0} \sup_{0 < \sigma^2 < \infty} \sup_{\Sigma \in \mathfrak{C}} E_{\mu,\sigma^2\Sigma}(\varphi)$, where $E_{\mu,\sigma^2\Sigma}$ refers to expectation under the probability
measure $P_{\mu,\sigma^2\Sigma}$.

Throughout the paper we shall always reserve the symbol $\hat\beta(y)$ for $(X'X)^{-1}X'y$, where $X$ is the
design matrix appearing in (1) and $y \in \mathbb{R}^n$. Furthermore, random vectors and random variables are
always written in bold capital and bold lower case letters, respectively. Lebesgue measure on $\mathbb{R}^n$
will be denoted by $\lambda_{\mathbb{R}^n}$, whereas Lebesgue measure on an affine subspace $\mathcal{A}$ of $\mathbb{R}^n$ (but viewed as a
measure on the Borel-sets of $\mathbb{R}^n$) will be denoted by $\lambda_{\mathcal{A}}$, with zero-dimensional Lebesgue measure
being interpreted as point mass. We shall write $\operatorname{int}(A)$, $\operatorname{cl}(A)$, and $\operatorname{bd}(A)$ for the interior, closure
and boundary of a set $A \subseteq \mathbb{R}^n$, respectively, taken with respect to the Euclidean topology. The
Euclidean norm is denoted by $\|\cdot\|$, while $d(x, A)$ denotes the Euclidean distance of the point $x \in \mathbb{R}^n$
to the set $A \subseteq \mathbb{R}^n$. Let $B'$ denote the transpose of a matrix $B$ and let $\operatorname{span}(B)$ denote the space
spanned by the columns of $B$. For a linear subspace $\mathcal{L}$ of $\mathbb{R}^n$ we let $\mathcal{L}^{\perp}$ denote its orthogonal
complement and we let $\Pi_{\mathcal{L}}$ denote the orthogonal projection onto $\mathcal{L}$. For a vector $x$ in Euclidean
space we define the symbol $\langle x \rangle$ to denote $\pm x$ for $x \neq 0$, the sign being chosen in such a way that
the first nonzero component of $\langle x \rangle$ is positive, and we set $\langle 0 \rangle = 0$. The $j$-th standard basis vector
in $\mathbb{R}^n$ is denoted by $e_j(n)$. The set of real matrices of dimension $m \times n$ is denoted by $\mathbb{R}^{m \times n}$. We
also introduce the following terminology.

Definition 2.1. Let $\mathfrak{C}$ be a set of symmetric and positive definite $n \times n$ matrices. An $l$-dimensional
linear subspace $\mathcal{Z}$ of $\mathbb{R}^n$ with $0 < l < n$ is called a concentration space of $\mathfrak{C}$, if there exists a sequence
$(\Sigma_m)_{m \in \mathbb{N}}$ in $\mathfrak{C}$, such that $\Sigma_m \to \bar\Sigma$ and $\operatorname{span}(\bar\Sigma) = \mathcal{Z}$.

While we shall in the sequel often refer to $\mathfrak{C}$ as the covariance model, one should keep in mind that
the set of all feasible covariance matrices corresponding to $\mathfrak{C}$ is given by $\{\sigma^2\Sigma : 0 < \sigma^2 < \infty,\ \Sigma \in \mathfrak{C}\}$.
In this context we note that two covariance models $\mathfrak{C}$ and $\mathfrak{C}^*$ can be equivalent in the sense of giving
rise to the same set of feasible covariance matrices, but need not have the same concentration
spaces.^2



2 In applying the general results in Section 5.2 or Corollary 5.17 to a particular problem some skill in choosing
between equivalent $\mathfrak{C}$ and $\mathfrak{C}^*$ may thus be required as one choice for $\mathfrak{C}$ may lead to more interesting results than does
another choice.



3 Size and Power of Tests of Linear Restrictions in Regression Models with Autocorrelated Disturbances



In this section we investigate size and power properties of autocorrelation robust tests that have
been designed for use in case of stationary disturbances. Studies of the properties of such tests
in the literature (Newey and West (1987, 1994), Andrews (1991), Andrews and Monahan (1992),
Kiefer et al. (2000), Kiefer and Vogelsang (2002a,b, 2005), Jansson (2002, 2004), Sun et al. (2008,
2011)) maintain assumptions that allow for nonparametric models for the spectral distribution of
the disturbances. For example, a typical nonparametric model results from assuming that the
disturbance vector consists of $n$ consecutive elements of a weakly stationary process with spectral
density equal to

$$f(\omega) = (2\pi)^{-1} \left| \sum_{j=0}^{\infty} c_j \exp(-\iota j \omega) \right|^2,$$

where, for $\xi \geq 0$ a given number, the coefficients $c_j$ satisfy the summability condition
$\sum_{j=0}^{\infty} j^{\xi} |c_j| < \infty$. Here $\iota$ denotes the imaginary unit. Let $\mathfrak{F}_{\xi}$ denote the collection of all such spectral densities $f$.
The corresponding covariance model $\mathfrak{C}_{\xi}$ is then given by $\{\Sigma(f) : f \in \mathfrak{F}_{\xi}\}$ where $\Sigma(f)$ is the $n \times n$
correlation matrix

$$\Sigma(f) = \left[ \int_{-\pi}^{\pi} \exp(-\iota \omega (i - j)) f(\omega)\, d\omega \Big/ \int_{-\pi}^{\pi} f(\omega)\, d\omega \right]_{i,j=1}^{n}.$$

Certainly, $\mathfrak{F}_{\xi}$ contains all spectral densities of stationary autoregressive moving average models of
arbitrary large order. Hence, the following assumption on the covariance model $\mathfrak{C}$ that we shall
impose for most results in this section is very mild and is satisfied by the typical nonparametric
model allowed for in the above mentioned literature. It certainly covers the case where $\mathfrak{C} = \mathfrak{C}_{\xi}$ or
where $\mathfrak{C}$ corresponds to an autoregressive model of order $p \geq 1$.
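As a numerical illustration of the map from a spectral density $f$ to the correlation matrix $\Sigma(f)$, the following sketch evaluates the two integrals on a uniform frequency grid for the AR(1) case $c_j = \rho^j$ (which satisfies the summability condition for every $\xi$) and compares the result with the matrix whose $(i,j)$-th entry is $\rho^{|i-j|}$. The function names, the grid size and the value $\rho = 0.5$ are illustrative assumptions and not part of the paper.

```python
import numpy as np

def ar1_spectral_density(omega, rho):
    """f(omega) = (2*pi)^{-1} |sum_j c_j exp(-i*j*omega)|^2 with c_j = rho**j."""
    transfer = 1.0 / (1.0 - rho * np.exp(-1j * omega))  # closed form of the geometric sum
    return np.abs(transfer) ** 2 / (2.0 * np.pi)

def corr_from_density(f_values, omega, n):
    """n x n matrix Sigma(f) from values of f on a uniform grid over [-pi, pi)."""
    # Grid averages approximate both integrals; the common factor 2*pi cancels in the ratio.
    num = np.array([np.mean(np.exp(-1j * omega * h) * f_values) for h in range(n)])
    gamma = (num / np.mean(f_values)).real  # f is even, so the ratio is real up to rounding
    idx = np.arange(n)
    return gamma[np.abs(idx[:, None] - idx[None, :])]

n, rho = 5, 0.5  # illustrative values
omega = np.linspace(-np.pi, np.pi, 4096, endpoint=False)
sigma_f = corr_from_density(ar1_spectral_density(omega, rho), omega, n)

idx = np.arange(n)
ar1_matrix = rho ** np.abs(idx[:, None] - idx[None, :])
max_gap = np.max(np.abs(sigma_f - ar1_matrix))
print(max_gap)
```

Up to numerical integration error the two matrices agree, which is the sense in which the nonparametric model contains the AR(1) correlation structures.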



Assumption 1. $\mathfrak{C}_{AR(1)} \subseteq \mathfrak{C}$.

Here $\mathfrak{C}_{AR(1)}$ denotes the set of correlation matrices corresponding to $n$ successive elements of a
stationary autoregressive process of order 1, i.e., $\mathfrak{C}_{AR(1)} = \{\Lambda(\rho) : \rho \in (-1, 1)\}$ where the $(i,j)$-th
entry in the $n \times n$ matrix $\Lambda(\rho)$ is given by $\rho^{|i-j|}$. As hinted at in the introduction, parameter
values $(\mu, \sigma^2, \Sigma)$ with $\Sigma = \Lambda(\rho)$ where $\rho$ gets close to $\pm 1$ and $\sigma^2$ is constant will play an important
role as they will be instrumental for establishing the bad size and power properties of the tests
presented below. We want to stress here that, as $\rho \to \pm 1$, the corresponding stationary process
does not converge to an integrated process but rather to a harmonic process. This is so because
the variance of the disturbances $\sigma^2$ is held constant. [If we parameterized in terms of $\rho$ and the
innovation variance $\sigma_{\varepsilon}^2 = \sigma^2(1 - \rho^2)$, this would correspond to $\sigma_{\varepsilon}^2 \to 0$ at the appropriate rate.]

For later use we note that under Assumption 1 the matrices $e_+e_+'$ and $e_-e_-'$ are limiting points
of the covariance model $\mathfrak{C}$ where $e_+ = (1, \ldots, 1)'$ and $e_- = (-1, 1, \ldots, (-1)^n)'$ are $n \times 1$ vectors
(since $\Lambda(\rho_m)$ converges to $e_+e_+'$ ($e_-e_-'$, respectively) if $\rho_m \to 1$ ($\rho_m \to -1$, respectively)). Other
singular limiting points of $\mathfrak{C}$ are possible, but $e_+e_+'$ and $e_-e_-'$ are the only singular limiting points
of $\mathfrak{C}_{AR(1)}$.
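The convergence of the AR(1) correlation matrices to the two rank-one limits can be checked numerically. The sketch below builds the matrix with entries $\rho^{|i-j|}$ and measures its entrywise distance to $e_+e_+'$ and $e_-e_-'$; the helper name `ar1_corr` and the values of $n$ and $\rho$ are illustrative assumptions.

```python
import numpy as np

def ar1_corr(rho, n):
    """n x n AR(1) correlation matrix with (i, j)-th entry rho**|i - j|."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

n = 6
e_plus = np.ones(n)
e_minus = (-1.0) ** np.arange(1, n + 1)  # (-1, 1, ..., (-1)^n)'

for rho in (0.9, 0.99, 0.9999):
    gap_plus = np.max(np.abs(ar1_corr(rho, n) - np.outer(e_plus, e_plus)))
    gap_minus = np.max(np.abs(ar1_corr(-rho, n) - np.outer(e_minus, e_minus)))
    print(rho, gap_plus, gap_minus)

# Each matrix along the sequence is positive definite, while the limits have rank one.
min_eig = np.linalg.eigvalsh(ar1_corr(0.9999, n)).min()
limit_rank = np.linalg.matrix_rank(np.outer(e_plus, e_plus))
```

The gaps shrink as $|\rho|$ approaches 1 while every matrix along the sequence stays positive definite, so the limits lie on the boundary of the covariance model rather than inside it.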

The next subsection presents the results for common nonparametrically based autocorrelation
robust tests whereas parametrically based tests are the subject of Subsection 3.2.



3.1 Nonparametrically based autocorrelation robust tests 

Commonly used autocorrelation robust tests for the null hypothesis $H_0$ given by (3) are based
on test statistics of the form $(R\hat\beta(y) - r)' \hat\Omega^{-1}(y) (R\hat\beta(y) - r)$, with the statistic typically being
undefined if $\hat\Omega(y)$ is singular. Here

$$\hat\Omega(y) = n R (X'X)^{-1} \hat\Psi(y) (X'X)^{-1} R' \tag{5}$$

and $\hat\Psi$ is a nonparametric estimator for $n^{-1} E(X'UU'X)$. The type of estimator $\hat\Psi$ we consider in
this subsection is obtained as a weighted sum of sample autocovariances of $\hat v_t(y) = \hat u_t(y) x_{t\cdot}'$, where
$\hat u_t(y)$ is the $t$-th coordinate of the least squares residual vector $\hat u(y) = y - X\hat\beta(y)$ and $x_{t\cdot}$ denotes
the $t$-th row vector of $X$. That is

$$\hat\Psi_w(y) = \sum_{j=-(n-1)}^{n-1} w(j, n)\, \hat\Gamma_j(y) \tag{6}$$

for every $y \in \mathbb{R}^n$ with $\hat\Gamma_j(y) = n^{-1} \sum_{t=j+1}^{n} \hat v_t(y) \hat v_{t-j}(y)'$ if $j \geq 0$ and $\hat\Gamma_j(y) = \hat\Gamma_{-j}(y)'$ else. The
associated estimator $\hat\Omega$ will be denoted by $\hat\Omega_w$. We make the following assumption on the weights.
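The estimators in (5) and (6) can be written out directly. The following is a minimal sketch that uses Bartlett weights $w(j,n) = \max(0, 1 - |j|/M)$ as one possible weighting scheme; the function names, the simulated design and the bandwidth $M = 4$ are illustrative assumptions rather than choices made in the paper.

```python
import numpy as np

def bartlett_weight(j, M):
    """Bartlett weight max(0, 1 - |j|/M); M is an assumed bandwidth."""
    return max(0.0, 1.0 - abs(j) / M)

def psi_hat(y, X, M):
    """Weighted sum of sample autocovariances of v_t = u_t * x_t, as in (6)."""
    n, k = X.shape
    beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
    u_hat = y - X @ beta_hat              # least squares residuals
    V = u_hat[:, None] * X                # row t holds v_t'
    psi = np.zeros((k, k))
    for j in range(n):
        gamma_j = V[j:].T @ V[:n - j] / n  # n^{-1} sum_{t=j+1}^n v_t v_{t-j}'
        w = bartlett_weight(j, M)
        # Lags j and -j carry the same weight, and Gamma_{-j} = Gamma_j'.
        psi += w * gamma_j if j == 0 else w * (gamma_j + gamma_j.T)
    return psi

def omega_hat(y, X, R, M):
    """Omega(y) = n R (X'X)^{-1} Psi(y) (X'X)^{-1} R', as in (5)."""
    n = X.shape[0]
    XtX_inv = np.linalg.inv(X.T @ X)
    return n * R @ XtX_inv @ psi_hat(y, X, M) @ XtX_inv @ R.T

rng = np.random.default_rng(1)
n = 25
X = np.column_stack([np.ones(n), np.arange(1, n + 1)])  # intercept and linear trend
y = X @ np.array([1.0, 0.5]) + rng.standard_normal(n)
R = np.array([[0.0, 1.0]])                              # restriction on the trend coefficient

psi = psi_hat(y, X, M=4)
omega = omega_hat(y, X, R, M=4)
print(psi, omega)
```

With these weights the resulting $\hat\Psi_w(y)$ is symmetric and nonnegative definite, in line with the discussion following Assumption 2.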

Assumption 2. The weights $w(j, n)$ for $j = -(n - 1), \ldots, n - 1$ are data-independent and satisfy
$w(0, n) = 1$ as well as $w(-j, n) = w(j, n)$. Furthermore, the symmetric $n \times n$ Toeplitz matrix $\mathcal{W}_n$
with elements $w(i - j, n)$ is positive definite.^3

The positive definiteness assumption on $\mathcal{W}_n$ is weaker than the frequently employed assumption
that the Fourier transform $w^{\dagger}(\omega)$ of the weights is nonnegative for all $\omega \in [-\pi, \pi]$.^4 It certainly
implies that $\hat\Psi_w(y)$, and hence $\hat\Omega_w(y)$, is always nonnegative definite, but it will allow us to show
more, see Lemma 3.1 below. In many applications the weights take the form $w(j, n) = w_0(|j|/M_n)$,
where the lag-window $w_0$ is an even function with $w_0(0) = 1$ and where $M_n > 0$ is a truncation
lag (bandwidth) parameter. In this case the first part of the above assumption means that we
are considering deterministic bandwidths only (as is the case, e.g., in Newey and West (1987),
Sections 3-5 of Andrews (1991), Hansen (1992), Kiefer and Vogelsang (2002b, 2005), and Jansson
(2002, 2004)). Extensions of the results in this subsection to data-dependent bandwidth choices
and prewhitening will be discussed in Preinerstorfer (2013). Assumption 2 is known to be satisfied,
e.g., for the (modified) Bartlett, Parzen, or the Quadratic Spectral lag-window, but is not satisfied,
e.g., for the rectangular lag-window (with $M_n > 1$).^5 See Anderson (1971) or Hannan (1970) for
more discussion. It is also satisfied for many exponentiated lag-windows as used in Phillips et al.
(2006, 2007) and Sun et al. (2011).
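Positive definiteness of the weight matrix in Assumption 2 can be checked through its smallest eigenvalue. The sketch below does this for a Bartlett and a rectangular lag-window. The convention that the rectangular window equals 1 for arguments strictly below 1 and 0 otherwise, as well as the values $n = 6$ and $M_n = 2$, are assumptions made only for illustration.

```python
import numpy as np

def weight_matrix(n, M, window):
    """Symmetric Toeplitz matrix with elements w(i - j, n) = window(|i - j| / M)."""
    idx = np.arange(n)
    return window(np.abs(idx[:, None] - idx[None, :]) / M)

def bartlett(x):
    return np.maximum(0.0, 1.0 - x)

def rectangular(x):
    # Assumed convention: weight 1 for |x| < 1 and 0 otherwise.
    return (x < 1.0).astype(float)

n, M = 6, 2
min_eig_bartlett = np.linalg.eigvalsh(weight_matrix(n, M, bartlett)).min()
min_eig_rectangular = np.linalg.eigvalsh(weight_matrix(n, M, rectangular)).min()
print(min_eig_bartlett, min_eig_rectangular)
```

In this small example the Bartlett matrix has a strictly positive smallest eigenvalue, whereas the rectangular one has a negative eigenvalue and therefore violates Assumption 2.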

3 For the case where $\mathcal{W}_n$ is only nonnegative definite see Subsection 3.1.1.

4 Note that the quadratic form $\alpha'\mathcal{W}_n\alpha$ can be represented as $\int_{-\pi}^{\pi} \left| \sum_{j=1}^{n} \alpha_j \exp(\iota j \omega) \right|^2 w^{\dagger}(\omega)\, d\omega$. If $w^{\dagger}(\omega) \geq 0$
for all $\omega \in [-\pi, \pi]$ is assumed, the integrand is nonnegative; and if $\alpha \neq 0$ it is positive almost everywhere (since it is
then a product of two nontrivial trigonometric polynomials).

5 The estimator in Keener et al. (1991) coincides with ($n$ times) the estimator given by (5) if the rectangular
lag-window is used and $R = I_k$.

In the typical asymptotic analysis of this sort of tests in the literature the event where the
estimator $\hat\Omega_w$ is singular is asymptotically negligible (as $\hat\Omega_w$ converges to a positive definite or almost
surely positive definite matrix), and hence there is no need to be specific about the definition of
the test statistic on this event. However, if one is concerned with finite-sample properties, one has
to think about the definition of the test statistic also in the case where $\hat\Omega_w(y)$ is singular. We thus
define the test statistic as follows:^6

$$T(y) = \begin{cases} (R\hat\beta(y) - r)' \hat\Omega_w^{-1}(y) (R\hat\beta(y) - r) & \text{if } \det \hat\Omega_w(y) \neq 0, \\ 0 & \text{if } \det \hat\Omega_w(y) = 0. \end{cases} \tag{7}$$

Of course, assigning the test statistic $T$ the value zero on the set where $\hat\Omega_w(y)$ is singular is arbitrary.
However, it will be irrelevant for size and power properties of the test provided we can ensure that
the set of $y \in \mathbb{R}^n$ for which $\det \hat\Omega_w(y) = 0$ holds is a $\lambda_{\mathbb{R}^n}$-null set (since all relevant distributions
$P_{\mu,\sigma^2\Sigma}$ are absolutely continuous w.r.t. $\lambda_{\mathbb{R}^n}$ due to the fact that every element $\Sigma$ of $\mathfrak{C}$ is positive
definite by assumption). We thus need to study under which circumstances this is ensured. This
will be done in the subsequent lemma. It will prove useful to introduce the following matrix for
every $y \in \mathbb{R}^n$

$$B(y) = R(X'X)^{-1}X' \operatorname{diag}(\hat u_1(y), \ldots, \hat u_n(y))
= R(X'X)^{-1}X' \operatorname{diag}\left(e_1'(n)\Pi_{\operatorname{span}(X)^{\perp}} y, \ldots, e_n'(n)\Pi_{\operatorname{span}(X)^{\perp}} y\right), \tag{8}$$

as well as the following assumption on the design matrix X (and on the restriction matrix R): 

Assumption 3. Let $1 \leq i_1 < \ldots < i_s \leq n$ denote all the indices for which $e_{i_j}(n) \in \operatorname{span}(X)$ holds
where $e_j(n)$ denotes the $j$-th standard basis vector in $\mathbb{R}^n$. If no such index exists, set $s = 0$. Let
$X'(\neg(i_1, \ldots, i_s))$ denote the matrix which is obtained from $X'$ by deleting all columns with indices $i_j$,
$1 \leq i_1 < \ldots < i_s \leq n$ (if $s = 0$ no column is deleted). Then $\operatorname{rank}(R(X'X)^{-1}X'(\neg(i_1, \ldots, i_s))) = q$
holds.

The lemma is now as follows. Note that the matrix B (y) does not depend on the weights 
w(j,n). 

Lemma 3.1. Suppose Assumption 2 is satisfied. Then the following holds:

1. $\hat\Omega_w(y)$ is nonnegative definite for every $y \in \mathbb{R}^n$.

2. $\hat\Omega_w(y)$ is singular if and only if $\operatorname{rank}(B(y)) < q$.

3. $\hat\Omega_w(y) = 0$ if and only if $B(y) = 0$.

4. The set of all $y \in \mathbb{R}^n$ for which $\hat\Omega_w(y)$ is singular (or, equivalently, for which $\operatorname{rank}(B(y)) < q$)
is either a $\lambda_{\mathbb{R}^n}$-null set or the entire sample space $\mathbb{R}^n$. The latter occurs if and only if
Assumption 3 is violated.
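One way to see why $B(y)$ governs the singularity of $\hat\Omega_w(y)$ is that, substituting (6) into (5) and collecting terms, the estimator can be rewritten as $B(y)\mathcal{W}_nB(y)'$; this identity is not stated in the text above but follows from the definitions. The sketch below checks it numerically, together with the two representations of the residuals in (8), on an illustrative design with Bartlett weights; all concrete values are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, M = 12, 3                                             # illustrative sample size and bandwidth
X = np.column_stack([np.ones(n), np.arange(1, n + 1)])
R = np.array([[0.0, 1.0]])
y = rng.standard_normal(n)

XtX_inv = np.linalg.inv(X.T @ X)
u_hat = y - X @ (XtX_inv @ X.T @ y)
idx = np.arange(n)
W = np.maximum(0.0, 1.0 - np.abs(idx[:, None] - idx[None, :]) / M)  # Bartlett weight matrix

# Omega_w(y) computed from (5) and (6) via the sample autocovariances.
V = u_hat[:, None] * X
psi = np.zeros((X.shape[1], X.shape[1]))
for j in range(n):
    gamma_j = V[j:].T @ V[:n - j] / n
    psi += W[0, j] * (gamma_j if j == 0 else gamma_j + gamma_j.T)
omega_direct = n * R @ XtX_inv @ psi @ XtX_inv @ R.T

# Omega_w(y) computed through the matrix B(y) of (8).
B = R @ XtX_inv @ X.T @ np.diag(u_hat)
omega_via_B = B @ W @ B.T

# Second representation in (8): residuals as projections onto span(X)-perp.
proj_perp = np.eye(n) - X @ XtX_inv @ X.T
print(omega_direct, omega_via_B, np.linalg.matrix_rank(B))
```

Since the weight matrix is positive definite under Assumption 2, the product $B(y)\mathcal{W}_nB(y)'$ is singular exactly when $B(y)$ has rank below $q$, which matches Part 2 of the lemma.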

Remark 3.2. (i) Setting $R = X'X$ and $q = k$ shows that a necessary and sufficient condition for
$\hat\Psi_w$ to be $\lambda_{\mathbb{R}^n}$-almost everywhere nonsingular is that $e_i(n) \notin \operatorname{span}(X)$ for all $i = 1, \ldots, n$. [If this
condition is not satisfied $\hat\Psi_w(y)$ is singular for every $y \in \mathbb{R}^n$.] In particular, it follows that under
this simple condition $\hat\Omega_w(y)$ is nonsingular $\lambda_{\mathbb{R}^n}$-almost everywhere for every choice of the restriction
matrix $R$.

(ii) In the case $q = 1$ Assumption 3 is easily seen to be violated if and only if

$$R(X'X)^{-1}X'e_i(n) = 0 \ \text{ or } \ e_i(n) \in \operatorname{span}(X) \ \text{ holds for every } i = 1, \ldots, n.$$



6 Some authors (e.g., Kiefer and Vogelsang (2002b, 2005)) choose to normalize also by $q$, the number of restrictions
to be tested. This is of course immaterial as long as one accordingly adjusts the critical value.



We learn from the preceding lemma that, provided Assumption 3 is satisfied (which only depends
on $X$ and $R$ and hence can be verified by the user), our choice of defining the test statistic $T$ to be
zero on the set where $\hat\Omega_w$ is singular is immaterial and has no effect on the size and power properties
of the test. We also learn from that lemma that, in case Assumption 3 is violated, the commonly
used autocorrelation robust tests break down completely in a trivial way as $\hat\Omega_w(y)$ is then singular
for every data point y. We are therefore forced to impose Assumption [3] on the design matrix X 
if we want commonly used autocorrelation robust tests to make any sense at all. We shall thus 
impose Assumption [3] in the following development. We also note that, given a restriction matrix 
R, the set of design matrices that lead to a violation of Assumption [3] is a "thin" subset in the set 
of all n x k matrices of full rank. 
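Because Assumption 3 depends only on $X$ and $R$, it can be checked mechanically. The sketch below is one possible implementation under a numerical tolerance: it detects the indices $i$ with $e_i(n) \in \operatorname{span}(X)$ through the diagonal of the projection matrix onto $\operatorname{span}(X)$ (a diagonal entry equals 1 exactly when the basis vector lies in the span), deletes the corresponding columns, and compares the rank with $q$. The function name, the tolerance and both example designs are illustrative assumptions.

```python
import numpy as np

def assumption3_holds(X, R, tol=1e-10):
    """Numerical check of the rank condition in Assumption 3 for given X and R."""
    XtX_inv = np.linalg.inv(X.T @ X)
    hat = X @ XtX_inv @ X.T
    # e_i(n) lies in span(X) exactly when the i-th diagonal entry of the projection equals 1.
    in_span = np.isclose(np.diag(hat), 1.0, rtol=0.0, atol=tol)
    reduced = (R @ XtX_inv @ X.T)[:, ~in_span]  # delete the columns with indices i_1, ..., i_s
    q = R.shape[0]
    if reduced.shape[1] == 0:
        return False
    return bool(np.linalg.matrix_rank(reduced) == q)

n = 8
# Intercept and trend: no standard basis vector lies in span(X).
X_ok = np.column_stack([np.ones(n), np.arange(1, n + 1)])
ok = assumption3_holds(X_ok, np.array([[0.0, 1.0]]))

# First regressor is e_1(n), second is orthogonal to it; the restriction concerns the first coefficient.
X_bad = np.column_stack([np.eye(n)[:, 0], np.arange(n, dtype=float)])
bad = assumption3_holds(X_bad, np.array([[1.0, 0.0]]))
print(ok, bad)
```

In the second design the first residual is identically zero, so $B(y) = 0$ for every $y$, which is the trivial breakdown described above.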

As usual, the test based on $T$ rejects $H_0$ if $T(y) \geq C$ where $C > 0$ is an appropriate critical value.
In applications the critical value is usually taken from the asymptotic distribution of $T$ (obtained
either under assumptions that guarantee consistency of $\hat\Omega_w$ or under the assumption of a "fixed
bandwidth", i.e., $M_n/n > 0$ independent of $n$). In the subsequent theorem, which discusses size and
power properties of autocorrelation robust tests based on $T$, we allow for arbitrary (nonrandom)
critical values $C > 0$.^7 Because of this, and since the theorem is a finite-sample result, it applies
equally well to standard autocorrelation robust tests (for which one fancies that $M_n \to \infty$ and
$M_n/n \to 0$ if $n$ would increase to infinity) and to so-called "fixed-bandwidth" tests (which assume
$M_n/n > 0$ independent of $n$).

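Theorem 3.3. Suppose Assumptions 1, 2, and 3 are satisfied. Let $T$ be the test statistic defined
in (7) with $\hat\Psi_w$ as in (6). Let $W(C) = \{y \in \mathbb{R}^n : T(y) \geq C\}$ be the rejection region where $C$ is a
real number satisfying $0 < C < \infty$. Then the following holds:

1. Suppose $\operatorname{rank}(B(e_+)) = q$ and $T(e_+ + \mu_0^*) > C$ hold for some (and hence all) $\mu_0^* \in \mathfrak{M}_0$, or
$\operatorname{rank}(B(e_-)) = q$ and $T(e_- + \mu_0^*) > C$ hold for some (and hence all) $\mu_0^* \in \mathfrak{M}_0$. Then

$$\sup_{\Sigma \in \mathfrak{C}} P_{\mu_0,\sigma^2\Sigma}(W(C)) = 1$$

holds for every $\mu_0 \in \mathfrak{M}_0$ and every $0 < \sigma^2 < \infty$. In particular, the size of the test is equal to
one.

2. Suppose $\operatorname{rank}(B(e_+)) = q$ and $T(e_+ + \mu_0^*) < C$ hold for some (and hence all) $\mu_0^* \in \mathfrak{M}_0$, or
$\operatorname{rank}(B(e_-)) = q$ and $T(e_- + \mu_0^*) < C$ hold for some (and hence all) $\mu_0^* \in \mathfrak{M}_0$. Then

$$\inf_{\Sigma \in \mathfrak{C}} P_{\mu_0,\sigma^2\Sigma}(W(C)) = 0$$

holds for every $\mu_0 \in \mathfrak{M}_0$ and every $0 < \sigma^2 < \infty$, and hence

$$\inf_{\mu_1 \in \mathfrak{M}_1} \inf_{\Sigma \in \mathfrak{C}} P_{\mu_1,\sigma^2\Sigma}(W(C)) = 0$$

holds for every $0 < \sigma^2 < \infty$. In particular, the test is biased. Furthermore, the nuisance-infimal
rejection probability at every point $\mu_1 \in \mathfrak{M}_1$ is zero, i.e.,

$$\inf_{0 < \sigma^2 < \infty} \inf_{\Sigma \in \mathfrak{C}} P_{\mu_1,\sigma^2\Sigma}(W(C)) = 0.$$

In particular, the infimal power of the test is equal to zero.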



7 Because the theorem is a finite-sample result, we are free to imagine that $C$ depends on sample size $n$. In fact,
there is nothing in the theory that prohibits us from imagining that $C$ depends even on the design matrix $X$, on the
restriction given by $(R, r)$, or on the weights $w(j, n)$.



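3. Suppose $B(e_+) = 0$ and $R\hat\beta(e_+) \neq 0$ hold, or $B(e_-) = 0$ and $R\hat\beta(e_-) \neq 0$ hold. Then

$$\sup_{\Sigma \in \mathfrak{C}} P_{\mu_0,\sigma^2\Sigma}(W(C)) = 1$$

holds for every $\mu_0 \in \mathfrak{M}_0$ and every $0 < \sigma^2 < \infty$. In particular, the size of the test is equal to
one.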
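Part 3 of the theorem can be illustrated in the location model, where $X = e_+$ and $R = 1$: the residual vector of $e_+$ is zero, so $B(e_+) = 0$, while $R\hat\beta(e_+) = 1 \neq 0$. The Monte Carlo sketch below estimates null rejection frequencies under AR(1) disturbances with $\sigma^2 = 1$ for $\rho = 0$ and for $\rho$ close to 1. The sample size, the Bartlett bandwidth, the critical value 3.84 (roughly the 95% quantile of a chi-square distribution with one degree of freedom), the number of replications and the seed are all illustrative assumptions, and the simulated frequencies are estimates rather than exact sizes.

```python
import numpy as np

def hac_statistic(y, X, R, r, W):
    """Test statistic (7); W is the Toeplitz matrix of the weights w(i - j, n)."""
    n = X.shape[0]
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    u_hat = y - X @ beta_hat
    V = u_hat[:, None] * X
    psi = V.T @ W @ V / n                     # matrix form of the weighted sum in (6)
    omega = n * R @ XtX_inv @ psi @ XtX_inv @ R.T
    d = R @ beta_hat - r
    if np.linalg.det(omega) == 0.0:
        return 0.0
    return float(d @ np.linalg.solve(omega, d))

def ar1_draw(rng, n, rho):
    """n successive elements of a stationary Gaussian AR(1) process with unit variance."""
    u = np.empty(n)
    u[0] = rng.standard_normal()
    for t in range(1, n):
        u[t] = rho * u[t - 1] + np.sqrt(1.0 - rho ** 2) * rng.standard_normal()
    return u

def rejection_frequency(rho, n=20, M=4, C=3.84, reps=2000, seed=0):
    rng = np.random.default_rng(seed)
    X, R, r = np.ones((n, 1)), np.array([[1.0]]), np.array([0.0])
    idx = np.arange(n)
    W = np.maximum(0.0, 1.0 - np.abs(idx[:, None] - idx[None, :]) / M)  # Bartlett weights
    hits = sum(hac_statistic(ar1_draw(rng, n, rho), X, R, r, W) >= C for _ in range(reps))
    return hits / reps

# Condition of Part 3 in the location model: B(e_+) = 0 and R beta_hat(e_+) = 1.
n = 20
e_plus = np.ones(n)
beta_at_e_plus = e_plus.mean()
B_at_e_plus = (e_plus / n) * (e_plus - beta_at_e_plus)

freq_iid = rejection_frequency(0.0)
freq_near_one = rejection_frequency(0.99999)
print(freq_iid, freq_near_one)
```

As $\rho$ approaches 1 with the disturbance variance held fixed, the residuals shrink while the sample mean does not, so the statistic explodes and the estimated null rejection frequency moves toward one.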

Remark 3.4. (i) As a point of interest we note that the rejection probabilities $P_{\mu,\sigma^2\Sigma}(W(C))$ can be
shown to depend on $(\mu, \sigma^2, \Sigma)$ only through $((R\beta - r)/\sigma, \Sigma)$ (in fact, only through $(\langle (R\beta - r)/\sigma \rangle, \Sigma)$),
see Lemma A.1 in Appendix A.

(ii) Although trivial, it is useful to note that the conclusions of the preceding theorem also apply
to any rejection region $W^* \in \mathcal{B}(\mathbb{R}^n)$ which differs from $W(C)$ by a $\lambda_{\mathbb{R}^n}$-null set.

(iii) By the way $T$ is defined in (7), the condition $T(e_+ + \mu_0^*) > C$ ($T(e_- + \mu_0^*) > C$, respectively)
in Part 1 of the preceding theorem already implies $\operatorname{rank}(B(e_+)) = q$ ($\operatorname{rank}(B(e_-)) = q$,
respectively). For reasons of comparability with Part 2 we have nevertheless included this rank
condition into the formulation of Part 1.

(iv) Inspection of the proof of Theorem 3.3 shows that Assumption 1 can obviously be weakened
to the assumption that $\mathfrak{C}$ contains AR(1) correlation matrices $\Lambda(\rho_m^{(1)})$ and $\Lambda(\rho_m^{(2)})$ for two sequences
$\rho_m^{(i)} \in (-1, 1)$ with $\rho_m^{(1)} \to 1$ and $\rho_m^{(2)} \to -1$. In fact, this can be further weakened to the assumption
that there exist $\Sigma_m^{(i)} \in \mathfrak{C}$ with $\Sigma_m^{(1)} \to e_+e_+'$ and $\Sigma_m^{(2)} \to e_-e_-'$ for $m \to \infty$.

The conditions in Parts 1-3 of the theorem only depend on the design matrix $X$, the restriction
$(R,r)$, the vector $e_+$ ($e_-$, respectively), the critical value $C$, and the weights $w(j,n)$ (via $T(e_+ + \mu_0^*)$
or $T(e_- + \mu_0^*)$, respectively). Hence, in any particular application it can be decided whether
(and which of) these conditions are satisfied. Furthermore, as will become transparent from the
examples to follow and from Proposition 3.5 below, in the majority of applications at least one of
these conditions will be satisfied, implying that common autocorrelation robust tests have size 1
and/or have power arbitrarily close to 0 in certain parts of the alternative hypothesis. Before we
turn to these examples, we want to provide some intuition for Theorem 3.3. Consider a sequence
$\rho_m \in (-1, 1)$ with $\rho_m \to 1$ ($\rho_m \to -1$, respectively) as $m \to \infty$. Then $\Sigma_m = \Lambda(\rho_m) \in \mathfrak{C}$ by
Assumption 1 and $\Lambda(\rho_m) \to e_+ e_+'$ ($e_- e_-'$) holds. Consequently, $P_{\mu_0, \sigma^2 \Sigma_m}$ concentrates more and
more around the one-dimensional subspace $\operatorname{span}(e_+)$ ($\operatorname{span}(e_-)$, respectively) in the sense that
it converges weakly to the singular Gaussian distribution $P_{\mu_0, \sigma^2 e_+ e_+'}$ ($P_{\mu_0, \sigma^2 e_- e_-'}$, respectively).
The conditions in Part 1 (or Part 3) of the preceding theorem then essentially allow one to show
that (i) the measure $P_{\mu_0, \sigma^2 e_+ e_+'}$ ($P_{\mu_0, \sigma^2 e_- e_-'}$, respectively) is supported by $W(C)$ (more precisely,
after $W(C)$ has been modified by a suitable $\lambda_{\mathbb{R}^n}$-null set), and (ii) that $P_{\mu_0, \sigma^2 e_+ e_+'}$ ($P_{\mu_0, \sigma^2 e_- e_-'}$,
respectively) puts no mass on the boundary of (the modified) set $W(C)$. By the Portmanteau
theorem we can then conclude that the sequence of measures $P_{\mu_0, \sigma^2 \Sigma_m}$ concentrates more and more
on $W(C)$ in the sense that $P_{\mu_0, \sigma^2 \Sigma_m}(W(C)) \to 1$ as $m \to \infty$, which establishes the conclusion
of Part 1 of the theorem. The proof of the first claim in Part 2 works along similar lines but
where concentration is now on the complement of the rejection region $W(C)$. For more discussion
see Subsection 5.2. The remaining results in Part 2 are obtained from the first claim in Part 2
exploiting invariance and continuity properties of the rejection probabilities. While concentration
of the probability measures $P_{\mu_0, \sigma^2 \Sigma_m}$ constitutes an important ingredient in the proof of Theorem
3.3, it should, however, be stressed that there are also other cases (cf. Theorems 3.6 and 3.7), where
despite concentration of $P_{\mu_0, \sigma^2 \Sigma_m}$ as above, the conditions for an application of the Portmanteau
theorem are not satisfied; in fact, in some of these cases size $< 1$ and infimal power $> 0$ can be
shown.
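
The convergence of the AR(1) correlation matrices to the two singular limits, which drives the concentration effect just described, can be checked numerically. The following is a minimal sketch, assuming that $\Lambda(\rho)$ has $(i,j)$-th entry $\rho^{|i-j|}$, that $e_+ = (1, \ldots, 1)'$, and that $e_- = (-1, 1, \ldots, (-1)^n)'$; the function names are ours and purely illustrative.

```python
import numpy as np


def ar1_corr(rho, n):
    """AR(1) correlation matrix with (i, j) entry rho**|i - j| (assumed form of Lambda(rho))."""
    idx = np.arange(n)
    return np.power(float(rho), np.abs(idx[:, None] - idx[None, :]))


def e_plus(n):
    """Vector of ones, (1, ..., 1)'."""
    return np.ones(n)


def e_minus(n):
    """Alternating vector (-1, 1, ..., (-1)**n)' (assumed sign convention)."""
    return np.power(-1.0, np.arange(1, n + 1))


def distance_to_singular_limit(rho, n):
    """Frobenius distance between Lambda(rho) and e e', with e = e_+ if rho > 0, else e_-."""
    e = e_plus(n) if rho > 0 else e_minus(n)
    return np.linalg.norm(ar1_corr(rho, n) - np.outer(e, e))


if __name__ == "__main__":
    n = 6
    for rho in (0.9, 0.99, 0.999, -0.9, -0.99, -0.999):
        # The distance shrinks as |rho| approaches 1.
        print(rho, distance_to_singular_limit(rho, n))
```

Both limits are rank-one matrices, so a Gaussian distribution with covariance $\sigma^2 \Lambda(\rho_m)$ is increasingly squeezed onto a line through its mean as $|\rho_m| \to 1$.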

We now consider a few examples that illustrate the implications of the preceding theorem. As 
in most applications the regression model contains an intercept, we concentrate on this case in the 
examples. 

Example 3.1. (Testing a restriction involving the intercept) Suppose that Assumptions 1, 2, and 3
hold. For definiteness assume that the first column of $X$ corresponds to the intercept (i.e., the first
column of $X$ is $e_+$). Assume also that the restriction involves the intercept, i.e., the first column
of $R$ is nonzero. Then it is easy to see that $B(e_+) = 0$ and $R\hat\beta(e_+) \neq 0$ hold (the latter since
$\hat\beta(e_+) = e_1(k)$). Consequently, Part 3 of Theorem 3.3 applies and shows that the size of the test $T$
is always 1. Additionally, the power deficiency results in Part 2 of the theorem will apply whenever
$\operatorname{rank}(B(e_-)) = q$ and $T(e_- + \mu_0^*) < C$ hold. [Whether or not this is the case will depend on $C$, $X$,
$R$, and the weights.]

Example 3.2. (Location model) Suppose that Assumptions 1 and 2 hold. Suppose $X = e_+$ and
the hypothesis is $\beta = \beta_0$ (hence $k = q = 1$). As just noted in Example 3.1, the size of the test $T$ is
then always 1 (as Assumption 3 is certainly satisfied). In this simple model the conditions for the
power deficiencies to arise can be made more explicit: Note that $B(e_-) \neq 0$ clearly always holds,
and hence $\operatorname{rank} B(e_-) = 1 = q$. If $n$ is even, it is also easy to see that $T(e_- + \beta_0 e_+) = 0 < C$
always holds. Consequently, Part 2 of Theorem 3.3 applies and shows that the power of the test
gets arbitrarily close to zero in certain parts of the parameter space as described in the theorem. If
$n$ is odd, then $T(e_- + \beta_0 e_+) = n^{-1} \hat\Psi_w^{-1}(e_-)$ and the same conclusion applies provided this quantity
is less than $C$. For example, for the (modified) Bartlett lag-window numerical computations show
that $n^{-1} \hat\Psi_w^{-1}(e_-)$ is less than 1.563 for every odd $n$ in the range 1 < n < 1000 and every choice of
$M_n/n \in (0, 1]$; hence, if $C$ has been chosen to be larger than or equal to 1.563, which is typically
the case at conventional nominal significance levels, the power deficiencies are also guaranteed to
arise. We note here that this simple location model is often used in Monte Carlo studies that
try to assess finite-sample properties of autocorrelation robust tests. Furthermore, autocorrelation
robust testing of the location parameter plays an important role in the field of simulation, see, e.g.,
Heidelberger and Welch (1981), Flegal and Jones (2010).
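
The claim for even $n$ in the location model is easy to verify numerically. The sketch below assumes that, for $X = e_+$, the statistic reduces to $T(y) = n(\bar y - \beta_0)^2 / \hat\Psi_w(y)$ with $\hat\Psi_w(y)$ a weighted sum of residual autocovariances, and it assumes plain Bartlett weights $w(j,n) = \max(0, 1 - |j|/M)$. The paper's "(modified)" Bartlett window may be defined differently, so values for odd $n$ produced by this sketch are not meant to reproduce the bound 1.563. All function names are hypothetical.

```python
import numpy as np


def bartlett_weights(n, M):
    """Bartlett weights max(0, 1 - j/M) for lags j = 0, ..., n-1 (an assumed form of w(j, n))."""
    j = np.arange(n)
    return np.clip(1.0 - j / M, 0.0, None)


def e_minus(n):
    """Alternating vector (-1, 1, ..., (-1)**n)' (assumed sign convention)."""
    return np.power(-1.0, np.arange(1, n + 1))


def location_T(y, beta0, M):
    """T(y) = n (ybar - beta0)^2 / Psi_hat(y) for X = e_+; returns 0 if Psi_hat(y) is not positive."""
    y = np.asarray(y, dtype=float)
    n = y.size
    u = y - y.mean()  # OLS residuals in the location model
    # Sample autocovariances of the residuals at lags 0, ..., n-1.
    gamma = np.array([u[j:] @ u[:n - j] / n for j in range(n)])
    w = bartlett_weights(n, M)
    psi = w[0] * gamma[0] + 2.0 * np.sum(w[1:] * gamma[1:])
    if psi <= 0.0:
        return 0.0
    return n * (y.mean() - beta0) ** 2 / psi


if __name__ == "__main__":
    beta0 = 2.5
    for n in (10, 11):
        y = e_minus(n) + beta0 * np.ones(n)
        print(n, location_T(y, beta0, M=4))
```

For even $n$ the entries of $e_-$ sum to zero, so the sample mean of $e_- + \beta_0 e_+$ equals $\beta_0$ and the numerator of the statistic vanishes regardless of the weights.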



Example 3.3. (Testing a zero restriction on a slope parameter) Consider the same regression model
as in Example 3.1 with the same assumptions, but now suppose that the hypothesis is $\beta_i = 0$ for
some $i > 1$, i.e., we are interested in testing a slope parameter. Since in this case $B(e_+) = 0$ and
$R\hat\beta(e_+) = 0$ obviously hold, where $R = e_i(k)'$, we need to investigate the behavior of $B(e_-)$ in order
to be able to apply Theorem 3.3. If $\operatorname{rank} B(e_-) = 1$ holds (which will generically be the case) then
size equals 1 in case $T(e_-) > C$ and the power deficiencies arise in case $T(e_-) < C$.

Example 3.4. (Testing for a change in mean) A special case of the preceding example is the case
where $k = 2$, the first column of $X$ is $e_+$ and the second column has entries $x_{t2} = 0$ for $1 \le t \le t^*$
and $x_{t2} = 1$ else. We assume $t^*$ to be known and to satisfy $1 < t^* < n$. The hypothesis to be
tested is $\beta_2 = 0$. It is then easy to see that Assumption 3 is satisfied. Furthermore, some simple
computations show that $\operatorname{rank} B(e_-) = q = 1$ always holds. Hence, the test $T$ has size 1 if $T(e_-) > C$
and the power deficiencies arise if $T(e_-) < C$. In case $n$ as well as $n - t^*$ are even, the latter case
always arises since $T(e_-) = 0$ holds. [If $n$ or $n - t^*$ is odd, $T(e_-)$ can of course be computed and
depends only on $n$, $t^*$, and $\hat\Psi_w^{-1}(e_-)$. We omit the details.]

The cases in Theorem 3.3 leading to size 1 or to power deficiencies of the test based on $T$, while
not being exhaustive, are often satisfied in applications. We make this formal in the subsequent
proposition in that we prove that, for given restriction $(R, r)$ and critical value $C$, the conditions
in Theorem 3.3 involving $X$ are generically satisfied. The first part of the proposition shows that
these conditions are generically satisfied in the universe of all possible $n \times k$ design matrices of rank
$k$. Parts 2 and 3 show that the same is true if we impose that the regression model has to contain
an intercept. In the subsequent proposition the dependence of $B(y)$, of $T(y)$, as well as of $\hat\Omega_w(y)$
on $X$ will be important and thus we shall write $B_X(y)$, $T_X(y)$, and $\hat\Omega_{w,X}(y)$ for these quantities
in the result to follow.

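Proposition 3.5. Suppose Assumption 1 holds. Fix $(R,r)$ with $\operatorname{rank}(R) = q$, fix $0 < C < \infty$, and
fix the weights $w(j,n)$ which are assumed to satisfy Assumption 2. Let $T$ be the test statistic defined
in (7) with $\hat\Psi_w$ as in (4) and let $\mu_0^* \in \mathfrak{M}_0$ be arbitrary.

1. Define

$$\mathfrak{X}_0 = \{X \in \mathbb{R}^{n \times k} : \operatorname{rank}(X) = k\}, \qquad \mathfrak{X}_1(e_+) = \{X \in \mathfrak{X}_0 : \operatorname{rank}(B_X(e_+)) < q\},$$

$$\mathfrak{X}_2(e_+) = \{X \in \mathfrak{X}_0 \setminus \mathfrak{X}_1(e_+) : T_X(e_+ + \mu_0^*) = C\},$$

and similarly define $\mathfrak{X}_1(e_-)$, $\mathfrak{X}_2(e_-)$. [Note that $\mathfrak{X}_2(e_+)$ and $\mathfrak{X}_2(e_-)$ do not depend on
the choice of $\mu_0^*$.] Then $\mathfrak{X}_1(e_+)$, $\mathfrak{X}_2(e_+)$, $\mathfrak{X}_1(e_-)$, and $\mathfrak{X}_2(e_-)$ are $\lambda_{\mathbb{R}^{n \times k}}$-null sets. The
set of all design matrices $X \in \mathfrak{X}_0$ for which Theorem 3.3 does not apply is a subset of
$(\mathfrak{X}_1(e_+) \cup \mathfrak{X}_2(e_+)) \cap (\mathfrak{X}_1(e_-) \cup \mathfrak{X}_2(e_-))$ and hence is a $\lambda_{\mathbb{R}^{n \times k}}$-null set. It thus is a
"negligible" subset of $\mathfrak{X}_0$ in view of the fact that $\mathfrak{X}_0$ differs from $\mathbb{R}^{n \times k}$ only by a $\lambda_{\mathbb{R}^{n \times k}}$-null set.

2. Suppose $k \ge 2$, $X$ has $e_+$ as its first column, i.e., $X = (e_+, \tilde X)$, and suppose the first column
of $R$ consists of zeros only. Define

$$\tilde{\mathfrak{X}}_0 = \{\tilde X \in \mathbb{R}^{n \times (k-1)} : \operatorname{rank}((e_+, \tilde X)) = k\},$$
$$\tilde{\mathfrak{X}}_1(e_-) = \{\tilde X \in \tilde{\mathfrak{X}}_0 : \operatorname{rank}(B_{(e_+, \tilde X)}(e_-)) < q\},$$
$$\tilde{\mathfrak{X}}_2(e_-) = \{\tilde X \in \tilde{\mathfrak{X}}_0 \setminus \tilde{\mathfrak{X}}_1(e_-) : T_{(e_+, \tilde X)}(e_- + \mu_0^*) = C\},$$

and note that $\tilde{\mathfrak{X}}_2(e_-)$ does not depend on the choice of $\mu_0^*$. Then $\tilde{\mathfrak{X}}_1(e_-)$ and $\tilde{\mathfrak{X}}_2(e_-)$ are
$\lambda_{\mathbb{R}^{n \times (k-1)}}$-null sets (with the analogously defined sets $\tilde{\mathfrak{X}}_1(e_+)$ and $\tilde{\mathfrak{X}}_2(e_+)$ satisfying
$\tilde{\mathfrak{X}}_1(e_+) = \tilde{\mathfrak{X}}_0$ and $\tilde{\mathfrak{X}}_2(e_+) = \emptyset$). The set of all matrices $\tilde X \in \tilde{\mathfrak{X}}_0$ such that Theorem 3.3
does not apply to the design matrix $X = (e_+, \tilde X)$ is a subset of $\tilde{\mathfrak{X}}_1(e_-) \cup \tilde{\mathfrak{X}}_2(e_-)$ and hence
is a $\lambda_{\mathbb{R}^{n \times (k-1)}}$-null set. It thus is a "negligible" subset of $\tilde{\mathfrak{X}}_0$ in view of the fact that $\tilde{\mathfrak{X}}_0$
differs from $\mathbb{R}^{n \times (k-1)}$ only by a $\lambda_{\mathbb{R}^{n \times (k-1)}}$-null set.

3. Suppose $k \ge 2$, $X = (e_+, \tilde X)$, and suppose the first column of $R$ is nonzero. Then Theorem 3.3
applies to the design matrix $X = (e_+, \tilde X)$ for every $\tilde X \in \tilde{\mathfrak{X}}_0$ (provided $X$ satisfies
Assumption 3).$^{8}$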



8 If $X$ does not satisfy Assumption 3 then the test breaks down in a trivial way as already discussed.

The proof of the proposition actually shows more, namely that the set of design matrices for
which Theorem 3.3 does not apply is contained in an algebraic set. We also remark that if the
regressor matrix $X$ is viewed as randomly drawn from a distribution that is absolutely continuous
w.r.t. $\lambda_{\mathbb{R}^{n \times k}}$, Proposition 3.5 implies that then the conditions of Theorem 3.3 are almost surely
satisfied; if $X$ is also independent of $\mathbf{U}$, Theorem 3.3 then establishes negative results for the
conditional rejection probabilities for almost all realizations of $X$.

We next discuss an exceptional case to which Theorem 3.3 does not apply and which is interesting
in that a positive result can be established, at least if the covariance model $\mathfrak{C}$ is assumed to be
$\mathfrak{C}_{AR(1)}$ or is approximated by $\mathfrak{C}_{AR(1)}$ near the singular points in the sense of Remark 3.9 below.
This positive result will then guide us to an improved version of the test statistic $T$.

Theorem 3.6. Suppose $\mathfrak{C} = \mathfrak{C}_{AR(1)}$ and suppose Assumptions 2 and 3 are satisfied. Let $T$ be the
test statistic defined in (7) with $\hat\Psi_w$ as in (4). Let $W(C) = \{y \in \mathbb{R}^n : T(y) \ge C\}$ be the rejection
region where $C$ is a real number satisfying $0 < C < \infty$. If $e_+, e_- \in \mathfrak{M}$ and $R\hat\beta(e_+) = R\hat\beta(e_-) = 0$
is satisfied, then the following holds:

1. The size of the rejection region $W(C)$ is strictly less than 1, i.e.,

$$\sup_{\mu_0 \in \mathfrak{M}_0} \sup_{0 < \sigma^2 < \infty} \sup_{-1 < \rho < 1} P_{\mu_0, \sigma^2 \Lambda(\rho)}(W(C)) < 1.$$

Furthermore,

$$\inf_{\mu_0 \in \mathfrak{M}_0} \inf_{0 < \sigma^2 < \infty} \inf_{-1 < \rho < 1} P_{\mu_0, \sigma^2 \Lambda(\rho)}(W(C)) > 0.$$

2. The infimal power is bounded away from zero, i.e.,

$$\inf_{\mu_1 \in \mathfrak{M}_1} \inf_{0 < \sigma^2 < \infty} \inf_{-1 < \rho < 1} P_{\mu_1, \sigma^2 \Lambda(\rho)}(W(C)) > 0.$$

3. For every $0 < c < \infty$

$$\inf_{\substack{\mu_1 \in \mathfrak{M}_1,\ 0 < \sigma^2 < \infty \\ d(\mu_1, \mathfrak{M}_0)/\sigma \ge c}} P_{\mu_1, \sigma^2 \Lambda(\rho_m)}(W(C)) \to 1$$

holds for $m \to \infty$ and for any sequence $\rho_m \in (-1, 1)$ satisfying $|\rho_m| \to 1$. Furthermore, for
every sequence $0 < c_m < \infty$ and every $0 < \varepsilon < 1$

$$\inf_{\substack{\mu_1 \in \mathfrak{M}_1,\ -1 + \varepsilon \le \rho \le 1 - \varepsilon \\ d(\mu_1, \mathfrak{M}_0) \ge c_m}} P_{\mu_1, \sigma_m^2 \Lambda(\rho)}(W(C)) \to 1$$

holds for $m \to \infty$ whenever $0 < \sigma_m^2 < \infty$ and $c_m/\sigma_m \to \infty$. [The very last statement holds
even without the conditions $e_+, e_- \in \mathfrak{M}$ and $R\hat\beta(e_+) = R\hat\beta(e_-) = 0$.]

4. For every $\delta$, $0 < \delta < 1$, there exists a $C(\delta)$, $0 < C(\delta) < \infty$, such that

$$\sup_{\mu_0 \in \mathfrak{M}_0} \sup_{0 < \sigma^2 < \infty} \sup_{-1 < \rho < 1} P_{\mu_0, \sigma^2 \Lambda(\rho)}(W(C(\delta))) \le \delta.$$

The first statement of the theorem says that, in contrast to the cases considered in Theorem
3.3, the size of the test $T$ is now bounded away from 1 for any choice of the critical value $C$.
Moreover, the last part of the theorem shows that the size can be controlled to be less than or
equal to any prespecified significance level $\delta$ by a suitable choice of the critical value $C(\delta)$. Because
$P_{\mu_0, \sigma^2 \Lambda(\rho)}(W(C))$ does not depend on $\mu_0$ and $\sigma^2$ but only on $\rho$ (see Proposition 5.4) and because
this probability can be computed via simulation, the supremum of this probability over $\mu_0$, $\sigma^2$, and
$\rho$ can be easily found by a grid search; exploiting monotonicity of the probability with respect to
$C$, the value of $C(\delta)$ can then be found by a simple search algorithm. The theorem furthermore
shows that, again in contrast to the scenario considered in Theorem 3.3, the infimal power of the
test is at least bounded away from zero. The power even approaches 1 if either $\|(R\beta^{(1)} - r)/\sigma\|$
is bounded away from zero and $|\rho| \to 1$, or if $\|(R\beta^{(1)} - r)/\sigma\| \to \infty$ and $|\rho|$ is bounded away from
1. [Here $\beta^{(1)}$ is the parameter vector corresponding to $\mu_1$. Note that $d(\mu_1, \mathfrak{M}_0)$ is bounded from
above as well as from below by multiples of $\|R\beta^{(1)} - r\|$, where the constants involved are positive
and depend only on $X$, $R$, and $r$.]

The preceding theorem required $e_+, e_- \in \mathfrak{M}$ and $R\hat\beta(e_+) = R\hat\beta(e_-) = 0$. To illustrate, these
conditions are, e.g., satisfied if $e_+$ and $e_-$ constitute the first two columns of the matrix $X$ and the
hypothesis tested only involves coefficients $\beta_i$ with $i \ge 3$ (i.e., the first two columns of $R$ are zero).
While an intercept will typically be present in a regression model and thus $e_+$ appears as one of the
regressors (and hence satisfies $e_+ \in \mathfrak{M}$), $e_-$ will not necessarily be an element of $\mathfrak{M}$, and hence the
preceding theorem will not apply. However, the following theorem shows how we can nevertheless
extend the same positive results to this case if we apply a simple adjustment to the test statistic $T$.

Theorem 3.7. Suppose $\mathfrak{C} = \mathfrak{C}_{AR(1)}$ and suppose Assumption 2 is satisfied. Suppose one of the
following scenarios applies:

1. $e_+ \in \mathfrak{M}$ with $R\hat\beta(e_+) = 0$ and $e_- \notin \mathfrak{M}$. Furthermore, $k + 1 < n$ holds and the $n \times (k + 1)$
matrix $\bar X = (X, e_-)$ (which necessarily has rank $k + 1$) satisfies Assumption 3 relative to the
$q \times (k + 1)$ restriction matrix $\bar R = (R, 0)$. Define $\bar\beta(y) = (I_k, 0)(\bar X'\bar X)^{-1}\bar X'y$.

2. $e_+ \notin \mathfrak{M}$ and $e_- \in \mathfrak{M}$ with $R\hat\beta(e_-) = 0$. Furthermore, $k + 1 < n$ holds and the $n \times (k + 1)$
matrix $\bar X = (X, e_+)$ (which necessarily has rank $k + 1$) satisfies Assumption 3 relative to the
$q \times (k + 1)$ restriction matrix $\bar R = (R, 0)$. Define $\bar\beta(y) = (I_k, 0)(\bar X'\bar X)^{-1}\bar X'y$.

3. $e_+ \notin \mathfrak{M}$ and $e_- \notin \mathfrak{M}$ with $\operatorname{rank}(X, e_+, e_-) = k + 2$. Furthermore, $k + 2 < n$ holds and
the $n \times (k + 2)$ matrix $\bar X = (X, e_+, e_-)$ (which necessarily has rank $k + 2$) satisfies Assumption 3
relative to the $q \times (k + 2)$ restriction matrix $\bar R = (R, 0, 0)$. Define
$\bar\beta(y) = (I_k, 0, 0)(\bar X'\bar X)^{-1}\bar X'y$.

4. $e_+ \notin \mathfrak{M}$ and $e_- \notin \mathfrak{M}$ with $\operatorname{rank}(X, e_+, e_-) = k + 1$. Furthermore, $k + 1 < n$ holds and the
$n \times (k + 1)$ matrix $\bar X = (X, e_+)$ (which necessarily has rank $k + 1$) satisfies Assumption 3 relative
to the $q \times (k + 1)$ restriction matrix $\bar R = (R, 0)$. Suppose further that $\bar R(\bar X'\bar X)^{-1}\bar X'e_- = 0$
holds. Define $\bar\beta(y) = (I_k, 0)(\bar X'\bar X)^{-1}\bar X'y$.

5. $e_+ \notin \mathfrak{M}$ and $e_- \notin \mathfrak{M}$ with $\operatorname{rank}(X, e_+, e_-) = k + 1$. Furthermore, $k + 1 < n$ holds and the
$n \times (k + 1)$ matrix $\bar X = (X, e_-)$ (which necessarily has rank $k + 1$) satisfies Assumption 3 relative
to the $q \times (k + 1)$ restriction matrix $\bar R = (R, 0)$. Suppose further that $\bar R(\bar X'\bar X)^{-1}\bar X'e_+ = 0$
holds. Define $\bar\beta(y) = (I_k, 0)(\bar X'\bar X)^{-1}\bar X'y$.

In all five scenarios define

$$\bar T(y) = \begin{cases} (R\bar\beta(y) - r)'\,\bar\Omega_w^{-1}(y)\,(R\bar\beta(y) - r) & \text{if } \det \bar\Omega_w(y) \neq 0, \\ 0 & \text{if } \det \bar\Omega_w(y) = 0, \end{cases}$$

where $\bar\Omega_w(y) = n\bar R(\bar X'\bar X)^{-1}\bar\Psi_w(y)(\bar X'\bar X)^{-1}\bar R'$, and $\bar\Psi_w(y)$ is computed from (4) based on
$\bar v_t(y) = \bar u_t(y)\bar x_{t\cdot}'$ instead of $\hat v_t(y)$. Here $\bar u_t(y)$ are the residuals from the regression of $y$ on $\bar X$,
and $\bar x_{t\cdot}$ are the rows of $\bar X$. Let $\bar W(C) = \{y \in \mathbb{R}^n : \bar T(y) \ge C\}$ be the rejection region where $C$ is a
real number satisfying $0 < C < \infty$. Then for each of the five scenarios the conclusions of Theorem 3.6
hold with $W(C)$ replaced by $\bar W(C)$.



Theorem 3.3 together with Proposition 3.5 has shown that generically the commonly used test
based on the statistic $T$ has severe size or power deficiencies even for $\mathfrak{C} = \mathfrak{C}_{AR(1)}$, while Theorem 3.6
has isolated a special case where this is not so. Theorem 3.7 now shows that in many of the cases
falling under the wrath of Theorem 3.3 the ensuing problems can be circumvented (if $\mathfrak{C} = \mathfrak{C}_{AR(1)}$)
by making use of the adjusted version $\bar T$ of the test statistic. The adjustment mechanism is simple
and amounts to basing the test statistic on estimators $\bar\beta$ and $\bar\Omega_w$ that are obtained from a "working
model" that always adds the regressors $e_+$ and/or $e_-$ to the design matrix. Note that these
regressors effect a purging of the residuals from harmonic components of angular frequency 0 and
$\pi$. This purging effect together with the fact that the restrictions to be tested do not involve the
coefficients of the "purging" regressors $e_+$ and $e_-$ lies at the heart of the positive results expressed
in Theorems 3.6 and 3.7. Numerical results that will be presented elsewhere support the theoretical
result and show that the adjusted test based on $\bar T$ considerably improves over the unadjusted one
based on $T$.
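
The working-model construction can be sketched in a few lines. The code below is only an illustration under stated assumptions: it takes $e_+ = (1, \ldots, 1)'$ and $e_- = (-1, 1, \ldots, (-1)^n)'$, appends each of them to the design matrix whenever this increases the column rank, and returns the first $k$ working-model coefficients together with the working-model residuals. It does not check the side conditions of the five scenarios (for instance the rank and dimension requirements or the extra restrictions in scenarios 4 and 5), and in the case $\operatorname{rank}(X, e_+, e_-) = k + 1$ it always adds $e_+$, which corresponds to scenario 4 only. The function names are hypothetical.

```python
import numpy as np


def e_plus(n):
    """Vector of ones, (1, ..., 1)'."""
    return np.ones(n)


def e_minus(n):
    """Alternating vector (-1, 1, ..., (-1)**n)' (assumed sign convention)."""
    return np.power(-1.0, np.arange(1, n + 1))


def augment_design(X):
    """Append e_+ and then e_- to X whenever doing so increases the column rank."""
    X_bar = np.asarray(X, dtype=float)
    n = X_bar.shape[0]
    for e in (e_plus(n), e_minus(n)):
        candidate = np.column_stack([X_bar, e])
        if np.linalg.matrix_rank(candidate) > np.linalg.matrix_rank(X_bar):
            X_bar = candidate
    return X_bar


def adjusted_fit(X, y):
    """Return the first k working-model coefficients and the working-model residuals."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    k = X.shape[1]
    X_bar = augment_design(X)
    coef, *_ = np.linalg.lstsq(X_bar, y, rcond=None)
    residuals = y - X_bar @ coef
    return coef[:k], residuals


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 12
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    y = rng.normal(size=n)
    beta_bar, u_bar = adjusted_fit(X, y)
    # Residuals are orthogonal to both purging regressors (up to rounding).
    print(u_bar @ e_plus(n), u_bar @ e_minus(n))
```

The orthogonality of the residuals to $e_+$ and $e_-$ is the "purging" of the harmonic components at angular frequencies 0 and $\pi$ referred to above.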

Remark 3.8. (i) Suppose the scenario in Part 1 of the above theorem applies except that $k + 1 = n$
holds or $\bar X = (X, e_-)$ does not satisfy Assumption 3. Then the test statistic $\bar T$ is identically zero
and the adjustment procedure does not work. A similar remark applies to Parts 2-5.

(ii) Suppose the scenario of Part 4 of the above theorem applies except that $\bar R(\bar X'\bar X)^{-1}\bar X'e_- \neq 0$
holds. Applying Part 3 of Theorem 3.3 to $\bar T$ shows that this test has size 1 and hence the adjustment
procedure fails. A similar comment applies to the scenario of Part 5.

Remark 3.9. The results in Theorems 3.6 and 3.7 have assumed $\mathfrak{C} = \mathfrak{C}_{AR(1)}$. The results
immediately extend to other covariance models $\mathfrak{C}$ as long as $\mathfrak{C}$ is norm-bounded, the only singular
accumulation points of $\mathfrak{C}$ are $e_+ e_+'$ and $e_- e_-'$, and for every $\Sigma_m \in \mathfrak{C}$ converging to one of these
limit points there exists a sequence $(\rho_m)_{m \in \mathbb{N}}$ in $(-1, 1)$ such that $\Lambda(\rho_m)^{-1/2} \Sigma_m \Lambda(\rho_m)^{-1/2} \to I_n$ for
$m \to \infty$ (that is, near the "singular boundary" the covariance model $\mathfrak{C}$ behaves similarly to $\mathfrak{C}_{AR(1)}$).
This can be seen from an inspection of the proof. An extension of Theorems 3.6 and 3.7 to even
more general covariance models will be discussed elsewhere.

3.1.1 Alternative nonparametric estimators for the variance covariance matrix 

We next discuss test statistics of the form (7) that use estimators other than $\hat\Psi_w$.

A. (General quadratic estimators based on $\hat v_t$) The estimator $\hat\Psi_w$ given by (4) is a special case
of general quadratic estimators $\hat\Psi_{GQ}(y)$ of the form

$$\hat\Psi_{GQ}(y) = \sum_{t,s=1}^{n} w(t,s;n)\,\hat v_t(y)\hat v_s(y)'$$

for every $y \in \mathbb{R}^n$, where the $n \times n$ weighting matrix $\mathcal{W}_n^* = (w(t,s;n))_{t,s}$ is symmetric and
data-independent. While estimators of this more general form have been studied in the early literature on
spectral estimation, much of the literature has focused on the special case of weighted autocovariance
estimators of the form $\hat\Psi_w$ (partly as a consequence of a result in Grenander and Rosenblatt (1957)
that the restriction to the smaller class of estimators does not lead to inferior estimators in an
asymptotic sense). However, if the data are preprocessed by tapering before an estimator like $\hat\Psi_w$
is computed from the tapered data, the final estimator belongs to the class of general quadratic
estimators. Also, many modern spectral estimators studied in the engineering literature fall into
this class (see Thomson (1982)), but not into the more narrow class of weighted autocovariance
estimators. Another example are the estimators proposed in Phillips (2005) and Sun (2012). We
now distinguish two cases:

Case 1: The weighting matrix $\mathcal{W}_n^* = (w(t,s;n))_{t,s}$ is positive definite. Inspection of the proofs
then shows that all results given above for the tests $T$ based on $\hat\Psi_w$ remain valid as they stand if
$\hat\Psi_w$ is replaced by $\hat\Psi_{GQ}$ in the definition of the test statistic.

Case 2: The weighting matrix $\mathcal{W}_n^* = (w(t,s;n))_{t,s}$ is only assumed to be nonnegative definite
(as is the case for the estimators considered in Phillips (2005) and Sun (2012)). Arguing similarly as
in the proof of Lemma 3.1 one can show the following:

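Lemma 3.10. Suppose $\mathcal{W}_n^* = (w(t,s;n))_{t,s}$ is nonnegative definite and define

$$\hat\Omega_{GQ}(y) = nR(X'X)^{-1}\hat\Psi_{GQ}(y)(X'X)^{-1}R'.$$

Then the following hold:

1. $\hat\Omega_{GQ}(y)$ is nonnegative definite for every $y \in \mathbb{R}^n$.

2. $\hat\Omega_{GQ}(y)$ is singular if and only if $\operatorname{rank}(B(y)\mathcal{W}_n^*) < q$ (or, equivalently, if
$\operatorname{rank}(B(y)\mathcal{W}_n^{*1/2}) < q$).

3. $\hat\Omega_{GQ}(y) = 0$ if and only if $B(y)\mathcal{W}_n^* = 0$ (or, equivalently, if $B(y)\mathcal{W}_n^{*1/2} = 0$).

4. The set of all $y \in \mathbb{R}^n$ for which $\hat\Omega_{GQ}(y)$ is singular (or, equivalently, for which
$\operatorname{rank}(B(y)\mathcal{W}_n^*) < q$) is either a $\lambda_{\mathbb{R}^n}$-null set or the entire sample space $\mathbb{R}^n$.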

As a consequence we see that two cases can arise: In the first case $\hat\Omega_{GQ}(y)$ is singular for all
$y \in \mathbb{R}^n$, in which case the test statistic $T$ breaks down in a trivial way. Note that this arises
precisely if and only if $\operatorname{rank}(B(y)\mathcal{W}_n^*) < q$ for all $y \in \mathbb{R}^n$, which is a condition solely on the design
matrix $X$, the restriction matrix $R$, and the weighting matrix $\mathcal{W}_n^*$, and thus can be verified in any
particular application. Now suppose that the second case arises, i.e., $\operatorname{rank}(B(y)\mathcal{W}_n^*) = q$ for
$\lambda_{\mathbb{R}^n}$-almost all $y$. Then inspection of the proofs shows that Theorems 3.3 and 3.6 continue to hold for the
test statistic $T$ based on $\hat\Psi_{GQ}$ provided Assumption 3 is replaced by the just mentioned condition
$\operatorname{rank}(B(y)\mathcal{W}_n^*) = q$ for $\lambda_{\mathbb{R}^n}$-almost all $y$, and the matrix $B(y)$ in those theorems is replaced by
$B(y)\mathcal{W}_n^{*1/2}$. Also Theorem 3.7 generalizes with the obvious changes.



B. (An estimator based on $\hat u$) Because $n^{-1}E(X'\mathbf{U}\mathbf{U}'X) = n^{-1}X'E(\mathbf{U}\mathbf{U}')X$, a natural estimator
is

$$\hat\Psi_E(y) = n^{-1}X'\hat{\mathcal{K}}(y)X$$

for every $y \in \mathbb{R}^n$, where $\hat{\mathcal{K}}(y)$ is the symmetric $n \times n$ Toeplitz matrix with elements
$n^{-1}\sum_{i=j+1}^{n} \hat u_i(y)\hat u_{i-j}(y)$ in the $j$-th diagonal above the main diagonal. This estimator has already
been discussed in Eicker (1967), but does not seem to have been used much in the econometrics
literature. It is not difficult to see that $\hat\Psi_E(y)$ is always nonnegative definite. It is positive
definite if and only if $y \notin \operatorname{span}(X)$; and it is equal to zero for $y \in \operatorname{span}(X)$. Define the statistic $T_E$
via (7) with $\hat\Omega_w(y)$ replaced by $\hat\Omega_E(y)$ where the latter is obtained from (5) by replacing $\hat\Psi_w(y)$
by $\hat\Psi_E(y)$. It is then easy to see that Theorems 3.3, 3.6, and 3.7 carry over to the test based
on $T_E$ provided Assumption 3 is deleted from the formulation, the condition $\operatorname{rank}(B(e_+)) = q$
($\operatorname{rank}(B(e_-)) = q$, respectively) is replaced by $e_+ \notin \operatorname{span}(X)$ ($e_- \notin \operatorname{span}(X)$), and the condition
$B(e_+) = 0$ ($B(e_-) = 0$, respectively) is replaced by $e_+ \in \operatorname{span}(X)$ ($e_- \in \operatorname{span}(X)$). [While Eicker
(1967) provided conditions on the regressors under which consistency of $\hat\Psi_E(y)$ results, it may not
be consistent for some common forms of regressors (as noted in Eicker (1967)). Therefore one may
want to replace $\hat{\mathcal{K}}(y)$ by a variant where the empirical second moments are downweighted (or more
generally are obtained from an estimate of the spectral density of the errors $u_t$). Similar results
can then be obtained for this variant of the test. We omit the details.]
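
The estimator in B. is straightforward to compute. The following minimal sketch builds the Toeplitz matrix of residual autocovariances and forms $n^{-1}X'\hat{\mathcal{K}}(y)X$ as described above; the function name `eicker_psi` is ours and not taken from the literature, and the downweighted variant mentioned at the end of the paragraph is not implemented.

```python
import numpy as np


def eicker_psi(X, y):
    """Compute X' K(y) X / n, with K(y) the symmetric Toeplitz matrix of residual autocovariances."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    u = y - X @ coef  # OLS residuals
    # Entry for lag j: (1/n) * sum_{i=j+1}^{n} u_i u_{i-j}.
    gamma = np.array([u[j:] @ u[:n - j] / n for j in range(n)])
    idx = np.arange(n)
    K = gamma[np.abs(idx[:, None] - idx[None, :])]  # lag |i - j| on the (i, j) entry
    return X.T @ K @ X / n


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n = 10
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    y = rng.normal(size=n)
    print(np.linalg.eigvalsh(eicker_psi(X, y)))               # nonnegative eigenvalues
    print(eicker_psi(X, X @ np.array([1.0, -2.0])))           # numerically zero for y in span(X)
```

For $y \in \operatorname{span}(X)$ the residuals vanish, which is why the second matrix printed is zero up to rounding error.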

C. (Data-driven bandwidth, prewhitening, flat-top kernels, autoregressive estimates, random
critical values) Tests based on weighted autocovariance estimators $\hat\Psi$ but where the weights are
allowed to depend on the data (e.g., lag-window estimators with data-driven bandwidth choice or
exponentiated lag-window estimators with data-driven choice of the exponent), or where prewhitening
is used, or where the estimators $\hat\Psi$ may not always be nonnegative definite (as is the case with
so-called flat-top kernels, see Politis (2011)), are discussed in detail in Preinerstorfer (2013). Like
the results given above, they are obtained as special cases of very general results provided in
Subsection 5.4. The results in Subsection 5.4 essentially rely only on a certain equivariance property
of the estimator $\hat\Omega$ and allow for cases where $\hat\Omega$ is not always nonnegative definite. These results
also accommodate situations where the estimator $\hat\Omega(y)$ is only well-defined for $\lambda_{\mathbb{R}^n}$-almost all $y$,
a case that arises often when data-driven bandwidth or prewhitening are employed. Furthermore,
certain cases where the critical value is allowed to be random are covered by these results, see
Remark 5.16(ii) in Section 5.4 (and this is also true for Subsection 3.2 and Section 4). Finally,
tests based on estimators $\hat\Psi$ obtained from parametric models like vector autoregressions (see, e.g.,
den Haan and Levin (1997) or Sun and Kaplan (2012) and references therein) also fall into the
domain of the results in Subsection 5.4, but we abstain from a detailed analysis. See, however,
Subsection 3.2 for a related analysis.



3.1.2 Some discussion 

A. In the subsequent discussion we concentrate for definiteness on the test statistic $T$ that is based
on the estimator $\hat\Omega_w$. However, the discussion carries over mutatis mutandis to the case where the
alternative estimators $\hat\Omega$ discussed in Subsection 3.1.1 are used.

The negative results in Theorem 3.3 (i.e., size equal to 1 and/or nuisance-infimal rejection
probability equal to 0) are driven by the fact that, due to Assumption 1, the covariance model $\mathfrak{C}$
has $e_+ e_+'$ and $e_- e_-'$ as limiting points; cf. also Remark 3.4(iv). Suppose now that one would be
willing to assume that $\mathfrak{C}$ does not have any singular limiting point (and is norm-bounded which
is not really a restriction here). Then the negative results in Theorem 3.3 do not apply. In fact,
an application of Theorem 5.21 shows that size is now strictly less than 1 and the infimal power is
larger than 0. Does such an assumption on $\mathfrak{C}$ now solve the problem? We do not think so for at
least two reasons: First, making the assumption that the covariance model $\mathfrak{C}$ does not have singular
limit points (like $e_+ e_+'$ and $e_- e_-'$) is highly questionable, especially in view of the fact that the main
motivation for the development of autocorrelation robust tests has been the desire to avoid strong
assumptions on $\mathfrak{C}$ which could lead to misspecification issues. In particular, in the not unreasonable
case where $\mathfrak{C}$ contains AR(1) correlation matrices $\Lambda(\rho)$, such an assumption would require restricting
$\rho$ to an interval $(-1 + \varepsilon, 1 - \varepsilon)$ for some positive $\varepsilon$. Given the emphasis on unit root and near unit
root processes in econometrics, such an assumption seems untenable. Second, even if one is willing
to make such a heroic assumption, size or power problems can be present. To see this assume for
definiteness of the discussion that $\mathfrak{C} = \mathfrak{C}_{AR(1)}(\varepsilon, \varepsilon) = \{\Lambda(\rho) : \rho \in (-1 + \varepsilon, 1 - \varepsilon)\}$ for some small
$\varepsilon > 0$. As mentioned above, the size of the test based on $T$ will be less than 1 and the infimal power
will be larger than 0. However, an upshot of Theorem 3.3 still is that the size will be close to 1
and/or the infimal power will be close to 0 for generic design matrices $X$, provided $\varepsilon$ is small (more
precisely, for given sample size $n$ this will happen for sufficiently small $\varepsilon$).$^{9}$ Hence, even under such
an assumption, size/power problems will disappear only if one is willing to assume a relatively large
$\varepsilon$ (in relation to sample size $n$), making the assumption look even more heroic.

In the important special case where an intercept is in the regression and the hypothesis tested
does not involve the intercept, a positive result (i.e., size $< 1$ and infimal power $> 0$) can be
obtained from Theorem 5.21 provided $\mathfrak{C}$ has $e_+ e_+'$ as its only singular limit point (or none at
all) and is norm-bounded. [To be precise, Theorem 5.21 requires matrices in $\mathfrak{C}$ that approach
$e_+ e_+'$ to do so in a particular manner. This is, e.g., automatically satisfied for $\mathfrak{C} = \mathfrak{C}_{AR(1)}(\varepsilon, 0)$
defined subsequently, cf. Lemma G.1 in Appendix G.] While perhaps a bit more palatable, this
is still a restrictive assumption on the covariance model $\mathfrak{C}$. In the AR(1)-case this amounts to
$\mathfrak{C} = \mathfrak{C}_{AR(1)}(\varepsilon, 0) = \{\Lambda(\rho) : \rho \in (-1 + \varepsilon, 1)\}$, and size or power problems will generically still be
present if $\varepsilon$ is small as explained in the preceding paragraph.

In the context of the preceding discussion one should recall that in case $\mathfrak{C} = \mathfrak{C}_{AR(1)}$ Theorems
3.6 and 3.7 show how tests, which have size less than 1 and infimal power larger than 0, can easily
be obtained without any need of bounding $|\rho|$ away from unity. Therefore, it would be desirable to
free Theorems 3.6 and 3.7 from the assumption $\mathfrak{C} = \mathfrak{C}_{AR(1)}$. To what extent this can be achieved
without introducing implausible assumptions like the ones discussed in the preceding paragraphs
will be discussed elsewhere.



B. In a recent paper Perron and Ren (2011) argue that the impossibility results in Pötscher
(2002) for estimating the value of the spectral density at frequency zero are irrelevant in the context
of autocorrelation robust testing: In the framework of a Gaussian location model they compare the
behavior of common autocorrelation robust tests $t_{Robust}$, which are standardized with the help
of a spectral density estimate $\hat f_n(0)$, with a benchmark given by the infeasible test statistic $t_{f(0)}$
that uses the value of the unknown spectral density at frequency zero for standardization. They
find that common autocorrelation robust tests beat the infeasible test statistic along a sequence
of DGPs similar to the ones that have been used in Pötscher (2002) to establish ill-posedness of



9 Of course, size could be reduced by increasing the critical value, but this would come at the price of even further
reduced power.

the spectral density estimation problem. This is certainly true and in fact easy to understand:
Consider as another benchmark the infeasible test statistic $t_{ideal}$, say, which uses the (unknown)
finite-sample variance $s_n$ of the arithmetic mean for standardization rather than the asymptotic
variance $2\pi f(0)$, and observe that this statistic is exactly $N(0, 1)$ distributed (under the null) and
has well-behaved size and power properties. Because $s_n$ does in general not converge uniformly to
the asymptotic variance $2\pi f(0)$ (for the very same reasons that underlie the impossibility result
in Pötscher (2002)), $t_{f(0)}$ is not uniformly close to the ideal test $t_{ideal}$. The fact that $\hat f_n(0)$ is also
not uniformly close to $f(0)$ (due to the ill-posedness results in Pötscher (2002)) is now "helpful"
in the sense that it in principle allows for the possibility that $2\pi \hat f_n(0)$ might be closer to the ideal
standardization factor $s_n$ than is $2\pi f(0)$, thus allowing for the possibility that $t_{Robust}$ might be
closer to the ideal test $t_{ideal}$ than to $t_{f(0)}$. [Observe that $2\pi \hat f_n(0)$ as well as $s_n$ each not being
uniformly close to $2\pi f(0)$ does in principle not preclude (uniform) closeness between $2\pi \hat f_n(0)$ and
$s_n$.] In other words, "aiming" at $f(0)$ in standardizing the test statistic is simply the wrong thing to
do. In that sense, the ill-posedness of estimating $f(0)$ is then indeed irrelevant for autocorrelation
robust testing (simply because the benchmark $t_{f(0)}$ is irrelevant). As a matter of fact, there is no
statement to the contrary in Pötscher (2002): Note that Pötscher (2002) only discusses ill-posedness
of the problem of estimating $f(0)$ (considered to be the parameter of interest), and does not make
any statements regarding consequences of this ill-posedness for autocorrelation robust tests that
use $2\pi \hat f_n(0)$ as an estimate of the variance nuisance parameter. The claim opening the last but one
paragraph on p. 1 in Perron and Ren (2011) is thus simply false. Finally, the preceding discussion
begs the question whether or not uniform closeness of $2\pi \hat f_n(0)$ and $s_n$ can indeed be established
under sufficiently general assumptions on the underlying correlation structure. If possible, this
would then immediately transfer the good size and power properties of $t_{ideal}$ to $t_{Robust}$. However,
unfortunately this is not possible: Recall from Example 3.2 that in the location model considered in
Perron and Ren (2011) the size of common autocorrelation robust tests like $t_{Robust}$ is always equal
to 1.

3.1.3 Further obstructions to favorable size and power properties 

The negative results given in Theorem 3.3 rest on Assumption 1, i.e., $\mathfrak{C} \supseteq \mathfrak{C}_{AR(1)}$, and the fact that
there exist sequences $\Sigma_m \in \mathfrak{C}_{AR(1)}$ that converge to the singular matrices $e_+e_+'$ or $e_-e_-'$, leading
to a concentration phenomenon as discussed in the wake of Theorem 3.3. The commonly used
nonparametric covariance models discussed at the beginning of Section 3 of course also
contain $\mathfrak{C}_{AR(p)}$ for every $p$, where $\mathfrak{C}_{AR(p)}$ is the set of all $n \times n$ correlation matrices arising from
stationary autoregressive processes of order not larger than $p$. In this case additional singular limit
matrices arise which lead to additional conditions under which size equals 1 or infimal power equals
0. We illustrate this briefly for the case where $\mathfrak{C} \supseteq \mathfrak{C}_{AR(2)}$. To this end define for $\nu \in (0,\pi)$ the
matrix $E(\nu)$ as the $n \times 2$ matrix with $i$-th row equal to $(\cos(i\nu), \sin(i\nu))$. Furthermore set $E(0) = e_+$
and $E(\pi) = e_-$. In Lemma G.2 in Appendix G we show that the matrices $E(\nu)E(\nu)'$ for $\nu \in [0,\pi]$
arise as limits of sequences of matrices in $\mathfrak{C}_{AR(2)}$. Obviously, $E(\nu)E(\nu)'$ is singular whenever $n \geq 3$.
Restricting $\nu$ to the set $\{0, \pi\}$ in the subsequent theorem reproduces the conditions appearing in
Theorem 3.3 (albeit under the stronger assumptions that $\mathfrak{C} \supseteq \mathfrak{C}_{AR(2)}$ and $n \geq 3$).
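The construction of the limit matrices $E(\nu)E(\nu)'$ can be checked numerically. The following is only a minimal sketch: the function name `E` and the sample size are our own choices, and we assume the convention $e_+ = (1,\ldots,1)'$ and $e_- = (-1,1,\ldots,(-1)^n)'$, which is not restated in this subsection.

```python
import numpy as np

def E(nu, n):
    """n x 2 matrix with i-th row (cos(i*nu), sin(i*nu)) for nu in (0, pi).

    For nu = 0 and nu = pi we return e_plus and e_minus as n x 1 matrices
    (assumed convention: e_minus has i-th entry (-1)**i).
    """
    i = np.arange(1, n + 1)
    if nu == 0.0:
        return np.ones((n, 1))
    if nu == np.pi:
        return ((-1.0) ** i).reshape(n, 1)
    return np.column_stack([np.cos(i * nu), np.sin(i * nu)])

n = 6
ranks = {}
for nu in (0.0, np.pi / 3, np.pi):
    M = E(nu, n) @ E(nu, n).T          # n x n limit matrix E(nu) E(nu)'
    ranks[nu] = np.linalg.matrix_rank(M)
    print(f"nu = {nu:.4f}: rank = {ranks[nu]}, "
          f"smallest eigenvalue = {np.linalg.eigvalsh(M).min():.2e}")
```

Since $E(\nu)E(\nu)'$ has rank at most 2, the printed ranks stay below $n$ as soon as $n \geq 3$, which is the singularity used in the text.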

Theorem 3.11. Suppose $\mathfrak{C} \supseteq \mathfrak{C}_{AR(2)}$, Assumptions 2 and 3 are satisfied, and $n \geq 3$ holds. Let
$T$ be the test statistic defined in (7) with $\hat\Psi_w$ as in (6). Let $W(C) = \{y \in \mathbb{R}^n : T(y) \geq C\}$ be the
rejection region where $C$ is a real number satisfying $0 < C < \infty$. Then the following holds:

1. Suppose there exists a $\nu \in [0,\pi]$ such that $\mathrm{rank}(B(z)) = q$ and $T(z + \mu_0^*) > C$ hold for some
(and hence all) $\mu_0^* \in \mathfrak{M}_0$ and for $\lambda_{\mathrm{span}(E(\nu))}$-almost all $z \in \mathrm{span}(E(\nu))$. Then

$$\sup_{\Sigma \in \mathfrak{C}} P_{\mu_0, \sigma^2\Sigma}(W(C)) = 1$$

holds for every $\mu_0 \in \mathfrak{M}_0$ and every $0 < \sigma^2 < \infty$. In particular, the size of the test is equal to
one.

2. Suppose there exists a $\nu \in [0,\pi]$ such that $\mathrm{rank}(B(z)) = q$ and $T(z + \mu_0^*) < C$ hold for some
(and hence all) $\mu_0^* \in \mathfrak{M}_0$ and for $\lambda_{\mathrm{span}(E(\nu))}$-almost all $z \in \mathrm{span}(E(\nu))$. Then

$$\inf_{\Sigma \in \mathfrak{C}} P_{\mu_0, \sigma^2\Sigma}(W(C)) = 0$$

holds for every $\mu_0 \in \mathfrak{M}_0$ and every $0 < \sigma^2 < \infty$, and hence

$$\inf_{\mu_1 \in \mathfrak{M}_1} \inf_{\Sigma \in \mathfrak{C}} P_{\mu_1, \sigma^2\Sigma}(W(C)) = 0$$

holds for every $0 < \sigma^2 < \infty$. In particular, the test is biased. Furthermore, the nuisance-infimal
rejection probability at every point $\mu_1 \in \mathfrak{M}_1$ is zero, i.e.,

$$\inf_{0 < \sigma^2 < \infty} \inf_{\Sigma \in \mathfrak{C}} P_{\mu_1, \sigma^2\Sigma}(W(C)) = 0.$$

In particular, the infimal power of the test is equal to zero.

3. Suppose there exists a $\nu \in [0,\pi]$ such that $B(z) = 0$ and $R\hat\beta(z) \neq 0$ hold for $\lambda_{\mathrm{span}(E(\nu))}$-almost
all $z \in \mathrm{span}(E(\nu))$. Then

$$\sup_{\Sigma \in \mathfrak{C}} P_{\mu_0, \sigma^2\Sigma}(W(C)) = 1$$

holds for every $\mu_0 \in \mathfrak{M}_0$ and every $0 < \sigma^2 < \infty$. In particular, the size of the test is equal to
one.


To illustrate the value added of the preceding theorem when compared to Theorem 3.3 consider
the following example: Assume that $e_+$ and $e_-$ are both elements of $\mathfrak{M}$ and $R\hat\beta(e_+) = R\hat\beta(e_-) = 0$.
Then none of the conditions in Theorem 3.3 are satisfied and thus this theorem is not applicable.
Suppose now that the design matrix $X$ contains $E(\nu)$ for some $\nu \in (0,\pi)$ as a submatrix, i.e.,
seasonal regressors are included. Without loss of generality assume that $X = (E(\nu), \tilde X)$. If we
want to test for absence of seasonality at angular frequency $\nu$, this corresponds to $R = (I_2, 0)$ and
$r = 0$. In case Assumption 3 holds, the conditions in Case 3 of the preceding theorem are then
obviously satisfied and we conclude that the size of the test for absence of seasonality is equal to
one. [In case Assumption 3 is violated, the test breaks down in a trivial way as noted earlier.]
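The two conditions of Case 3 in this seasonality example can be verified numerically for a concrete design. The sketch below rests on assumptions that are not part of this subsection: we take $B(y) = R(X'X)^{-1}X'\,\mathrm{diag}(\hat u_1(y),\ldots,\hat u_n(y))$ as the form of the matrix defined in (8), and the remaining regressors (an intercept and a linear trend) as well as the frequency are hypothetical choices made only for illustration.

```python
import numpy as np

n, nu = 12, np.pi / 6
t = np.arange(1, n + 1)
E_nu = np.column_stack([np.cos(t * nu), np.sin(t * nu)])   # seasonal regressors
X = np.column_stack([E_nu, np.ones(n), t])                 # X = (E(nu), X_tilde), X_tilde hypothetical
R = np.hstack([np.eye(2), np.zeros((2, 2))])               # R = (I_2, 0): test absence of seasonality

# An arbitrary point z in span(E(nu)).
c = np.array([0.7, -1.3])
z = E_nu @ c

beta_hat = np.linalg.solve(X.T @ X, X.T @ z)               # OLS estimate at z
u_hat = z - X @ beta_hat                                   # OLS residuals at z (zero up to rounding)
B = R @ np.linalg.solve(X.T @ X, X.T) @ np.diag(u_hat)     # assumed form of B(z)
R_beta = R @ beta_hat

print("max |B(z)|   =", np.max(np.abs(B)))
print("R beta_hat(z) =", R_beta)
```

Because $z$ lies in the column space of $X$, the residuals vanish and hence $B(z) = 0$, while $R\hat\beta(z)$ recovers the coefficient vector `c`, which is nonzero for almost all $z \in \mathrm{span}(E(\nu))$.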

We finally ask what happens if we allow for covariance structures deriving from even higher-order
autoregressive models, i.e., $\mathfrak{C} \supseteq \mathfrak{C}_{AR(p)}$ with $p > 2$. While additional concentration spaces
arise and theorems like the one above can be easily obtained from Corollary 5.17, these theorems do
not generate new obstructions to good size and power properties. The reason for this is that any of
the newly arising concentration spaces already contains one of the concentration spaces $\mathrm{span}(E(\nu))$
for $\nu \in [0,\pi]$ as a subset.

3.2 Parametrically based autocorrelation robust tests



The results in Subsection 3.1 were given for autocorrelation robust tests that make use of a
nonparametric variance estimator. In this subsection we show that the phenomena encountered in Subsection
3.1 (size distortions and power deficiencies) are not a consequence of the nonparametric nature of
the estimator, but can equally arise if a parametric estimator is being used (and even if the parametric
model employed correctly describes the covariance structure of the errors). We illustrate
this for the case where the test statistic is obtained from a feasible generalized least squares (GLS)
estimator predicated on an AR(1) covariance structure, as well as for the case where the test statistic
is obtained from the ordinary least squares (OLS) estimator combined with an estimator for
the variance covariance matrix again predicated on the same covariance structure. The theoretical
results derived below are in line with Monte Carlo results provided in Park and Mitchell (1980) and
Magee (1989).



We start with the estimator $\hat\rho$ that will be used in the feasible GLS procedure as well as in the
estimator for the variance covariance matrix of the OLS estimator.

Assumption 4. For $a_1 \in \{1, 2\}$ and $a_2 \in \{n-1, n\}$ with $a_1 \leq a_2$ the estimator $\hat\rho$ is of the form

$$\hat\rho(y) = \sum_{t=2}^{n} \hat u_t(y)\hat u_{t-1}(y) \Big/ \sum_{t=a_1}^{a_2} \hat u_t^2(y)$$

for all $y \in \mathbb{R}^n \setminus N_0(a_1, a_2)$ and it is undefined for $y \in N_0(a_1, a_2) = \{y \in \mathbb{R}^n : \sum_{t=a_1}^{a_2} \hat u_t^2(y) = 0\}$.

The Yule-Walker estimator, which we shall abbreviate by $\hat\rho_{YW}$, corresponds to $a_1 = 1$, $a_2 = n$,
while the least squares estimator $\hat\rho_{LS}$ corresponds to $a_1 = 1$, $a_2 = n - 1$. The estimators which
use $a_1 = 2$, $a_2 = n - 1$ or $a_1 = 2$, $a_2 = n$ have also been considered in the literature (see, e.g.,
Park and Mitchell (1980), Magee (1989)).
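The family of estimators in Assumption 4 is easy to code. The following is a minimal sketch in which the function names, the simulated data, and the exact-zero check for the denominator are our own illustrative choices (in floating-point arithmetic an exact-zero test only approximates membership in $N_0(a_1,a_2)$).

```python
import numpy as np

def ols_residuals(y, X):
    """OLS residual vector u_hat(y) = y - X beta_hat(y)."""
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta_hat

def rho_hat(y, X, a1, a2):
    """Estimator of Assumption 4 with a1 in {1, 2} and a2 in {n-1, n} (1-based indices)."""
    u = ols_residuals(y, X)
    numerator = np.sum(u[1:] * u[:-1])        # sum_{t=2}^{n} u_t u_{t-1}
    denominator = np.sum(u[a1 - 1:a2] ** 2)   # sum_{t=a1}^{a2} u_t^2
    if denominator == 0.0:
        return None                           # undefined on N_0(a1, a2)
    return numerator / denominator

rng = np.random.default_rng(1)
n = 25
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
y = rng.standard_normal(n)

rho_yw = rho_hat(y, X, 1, n)        # Yule-Walker: a1 = 1, a2 = n
rho_ls = rho_hat(y, X, 1, n - 1)    # least squares: a1 = 1, a2 = n - 1
print(rho_yw, rho_ls)
```

All four variants share the same numerator; only the range of the sum of squares in the denominator changes, which is why $|\hat\rho_{LS}| \geq |\hat\rho_{YW}|$ whenever both are defined.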



Remark 3.12. (Some properties of $\hat\rho$) (i) For the Yule-Walker estimator $\hat\rho_{YW}$ we have $N_0(1,n) = \mathfrak{M}$,
i.e., $\hat\rho_{YW}$ is well-defined for every $y \in \mathbb{R}^n \setminus \mathfrak{M}$. Furthermore, $\hat\rho_{YW}$ is bounded away from 1
in modulus uniformly over its domain of definition, i.e., $\sup_{y \in \mathbb{R}^n \setminus \mathfrak{M}} |\hat\rho_{YW}(y)| < 1$ holds. This
follows easily from the well-known fact that $|\hat\rho_{YW}(y)| < 1$, that the supremum in question does not
change its value if the range for $y$ is replaced by the compact set $\{y \in \mathfrak{M}^{\perp} : \|y\| = 1\}$, and the fact
that $\hat\rho_{YW}$ is continuous on this set. [It can also be derived from the discussion in Section 3.5 in
Grenander and Rosenblatt (1957).]

(ii) The least squares estimator $\hat\rho_{LS}$ exhibits a somewhat different behavior: First, $\hat\rho_{LS}$ is well
defined only on $\mathbb{R}^n \setminus N_0(1, n-1)$, with $N_0(1, n-1)$ given by $\{y \in \mathbb{R}^n : \hat u(y) \in \mathrm{span}(e_n(n))\}$. Note
that $\mathbb{R}^n \setminus N_0(1, n-1)$ is contained in $\mathbb{R}^n \setminus \mathfrak{M}$, but is strictly smaller in case $e_n(n)$ is orthogonal to
each column of $X$. Second, $\hat\rho_{LS}$ is not bounded away from one in modulus, in fact $|\hat\rho_{LS}| > 1$ can
occur.$^{10}$

(iii) The behavior of the remaining two estimators $\hat\rho$ is similar to the behavior of $\hat\rho_{LS}$.

(iv) The set $N_0(a_1, a_2)$ is always a closed subset of $\mathbb{R}^n$. It is guaranteed to be a $\lambda_{\mathbb{R}^n}$-null set
provided $k \leq a_2 - a_1$ holds, cf. Lemma 3.13 below. This condition on $k$ is no restriction in the case
of the Yule-Walker estimator (since we have assumed $k < n$ from the beginning), and is a very mild
condition in the other cases (requiring $k \leq n - 2$ or $k \leq n - 3$ at most).



10 There are even cases where $\hat\rho_{LS}$ is unbounded.

The definition of the test statistics below will require inversion of $\Lambda(\hat\rho)$. While $\Lambda(\rho)$ is nonsingular
if $|\rho| \neq 1$, $\Lambda(\rho)$ is singular if $|\rho| = 1$, and hence we need to study the set of $y$ where $|\hat\rho(y)| = 1$ (or
$\hat\rho(y)$ is undefined).

Lemma 3.13. Let $\hat\rho$ satisfy Assumption 4. Then $\mathfrak{M} \subseteq N_0(a_1, a_2) \subseteq N_1(a_1, a_2)$ where

$$N_1(a_1, a_2) = \left\{ y \in \mathbb{R}^n : \left| \sum_{t=2}^{n} \hat u_t(y)\hat u_{t-1}(y) \right| = \sum_{t=a_1}^{a_2} \hat u_t^2(y) \right\}.$$

The set $N_1(a_1, a_2)$ is a closed subset of $\mathbb{R}^n$ and is precisely the set where the estimator $\hat\rho$ is either
not well-defined or is equal to 1 in modulus. The estimator $\hat\rho$ is continuous on $\mathbb{R}^n \setminus N_0(a_1, a_2) \supseteq
\mathbb{R}^n \setminus N_1(a_1, a_2)$. If $k \leq a_2 - a_1$ holds, the set $N_1(a_1, a_2)$ is a $\lambda_{\mathbb{R}^n}$-null set.

While for the Yule-Walker estimator $N_1(1,n) = N_0(1,n)$ holds as a consequence of Remark
3.12(i), for the other estimators $\hat\rho$ the corresponding set $N_1(a_1, a_2)$ can be a proper superset of
$N_0(a_1, a_2)$.

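Given an estimator $\hat\rho$ satisfying Assumption 4 we now introduce the test statistic

$$T_{FGLS}(y) = \begin{cases} (R\tilde\beta(y) - r)'\,\tilde\Omega^{-1}(y)\,(R\tilde\beta(y) - r) & \text{if } y \in \mathbb{R}^n \setminus N_2^*(a_1, a_2), \\ 0 & \text{else,} \end{cases}$$

where

$$\tilde\beta(y) = (X'\Lambda^{-1}(\hat\rho(y))X)^{-1} X'\Lambda^{-1}(\hat\rho(y))\,y,$$

$$\tilde\sigma^2(y) = (n-k)^{-1} (y - X\tilde\beta(y))'\Lambda^{-1}(\hat\rho(y))(y - X\tilde\beta(y)),$$

$$\tilde\Omega(y) = \tilde\sigma^2(y)\, R\,(X'\Lambda^{-1}(\hat\rho(y))X)^{-1} R'.$$

Here $N_2^*(a_1, a_2)$ is defined via

$$\mathbb{R}^n \setminus N_2^*(a_1, a_2) = \left\{ y \in \mathbb{R}^n \setminus N_2(a_1, a_2) : \tilde\sigma^2(y) \neq 0,\ \det\!\left(R(X'\Lambda^{-1}(\hat\rho(y))X)^{-1}R'\right) \neq 0 \right\},$$

where $N_2(a_1, a_2)$ is given by

$$\mathbb{R}^n \setminus N_2(a_1, a_2) = \left\{ y \in \mathbb{R}^n \setminus N_1(a_1, a_2) : \det\!\left(X'\Lambda^{-1}(\hat\rho(y))X\right) \neq 0 \right\}.$$

Note that $\tilde\beta$, $\tilde\sigma^2$, and $\tilde\Omega$ are well-defined on $\mathbb{R}^n \setminus N_2(a_1, a_2)$, with $\tilde\Omega(y)$ being nonsingular if and only
if $y \in \mathbb{R}^n \setminus N_2^*(a_1, a_2)$, see Lemma B.1 in Appendix B. Furthermore, define

$$T_{OLS}(y) = \begin{cases} (R\hat\beta(y) - r)'\,\hat\Omega^{-1}(y)\,(R\hat\beta(y) - r) & \text{if } y \in \mathbb{R}^n \setminus N_0^*(a_1, a_2), \\ 0 & \text{else,} \end{cases}$$

where $\hat\beta(y)$ is the OLS-estimator, $\hat\sigma^2(y) = (n-k)^{-1}\hat u'(y)\hat u(y)$, and

$$\hat\Omega(y) = \hat\sigma^2(y)\, R(X'X)^{-1}X'\Lambda(\hat\rho(y))X(X'X)^{-1}R'.$$

Here $N_0^*(a_1, a_2)$ is defined via

$$\mathbb{R}^n \setminus N_0^*(a_1, a_2) = \left\{ y \in \mathbb{R}^n \setminus N_0(a_1, a_2) : \det\!\left(R(X'X)^{-1}X'\Lambda(\hat\rho(y))X(X'X)^{-1}R'\right) \neq 0 \right\}.$$

Of course, $\hat\beta$ and $\hat\sigma^2$ are well-defined on all of $\mathbb{R}^n$, while $\hat\Omega$ is well-defined on $\mathbb{R}^n \setminus N_0(a_1, a_2) \supseteq
\mathbb{R}^n \setminus N_0^*(a_1, a_2)$. Furthermore, $\hat\Omega(y)$ is nonsingular for $y \in \mathbb{R}^n \setminus N_0^*(a_1, a_2)$, see Lemma B.1 in Appendix B.
We note that the exceptional sets $N_0^*(a_1, a_2)$ and $N_2^*(a_1, a_2)$, respectively, appearing in
the definition of the test statistics are $\lambda_{\mathbb{R}^n}$-null sets provided $k \leq a_2 - a_1$ holds, see Lemma B.1.
[For the case of the Yule-Walker estimator actually $N_2^*(1,n) = N_2(1,n) = N_1(1,n) = N_0^*(1,n) =
N_0(1,n) = \mathfrak{M}$ holds, because $\Lambda(\hat\rho_{YW}(y))$ is positive definite for every $y \notin N_0(1,n) = \mathfrak{M}$ in view of
$|\hat\rho_{YW}(y)| < 1$, cf. Remark 3.12(i).]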
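The definitions of $T_{FGLS}$ and $T_{OLS}$ translate directly into a few lines of linear algebra. The sketch below is illustrative only: function names and the simulated data are our own, $\Lambda(\rho)$ is taken to be the AR(1) correlation matrix with $(i,j)$-th entry $\rho^{|i-j|}$, and the exceptional sets on which the statistics are set to zero are not handled (the code simply assumes that all inverses exist for the given $y$).

```python
import numpy as np

def ar1_corr(rho, n):
    """AR(1) correlation matrix Lambda(rho) with (i, j) entry rho**|i - j|."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def fgls_and_ols_statistics(y, X, R, r, a1=1, a2=None):
    """Return (rho_hat, T_FGLS, T_OLS); default a1 = 1, a2 = n is Yule-Walker."""
    n, k = X.shape
    a2 = n if a2 is None else a2
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_ols = XtX_inv @ X.T @ y
    u = y - X @ beta_ols
    rho = np.sum(u[1:] * u[:-1]) / np.sum(u[a1 - 1:a2] ** 2)
    lam = ar1_corr(rho, n)
    lam_inv = np.linalg.inv(lam)

    # FGLS: beta_tilde, sigma_tilde^2, Omega_tilde
    bread = np.linalg.inv(X.T @ lam_inv @ X)
    beta_gls = bread @ X.T @ lam_inv @ y
    e = y - X @ beta_gls
    sigma2_gls = (e @ lam_inv @ e) / (n - k)
    omega_gls = sigma2_gls * (R @ bread @ R.T)
    d = R @ beta_gls - r
    t_fgls = float(d @ np.linalg.solve(omega_gls, d))

    # OLS with AR(1)-based covariance estimator: beta_hat, sigma_hat^2, Omega_hat
    sigma2_ols = (u @ u) / (n - k)
    omega_ols = sigma2_ols * (R @ XtX_inv @ X.T @ lam @ X @ XtX_inv @ R.T)
    d = R @ beta_ols - r
    t_ols = float(d @ np.linalg.solve(omega_ols, d))
    return rho, t_fgls, t_ols

rng = np.random.default_rng(0)
n = 30
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
y = X @ np.array([1.0, 0.0]) + rng.standard_normal(n)
R = np.array([[0.0, 1.0]])    # test the slope coefficient
r = np.array([0.0])
rho, t_fgls, t_ols = fgls_and_ols_statistics(y, X, R, r)
print(rho, t_fgls, t_ols)
```

With the Yule-Walker choice used here, $\Lambda(\hat\rho(y))$ is positive definite, so both statistics are nonnegative; for the other choices of $(a_1, a_2)$ this need not be the case.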

As already noted in Remark 3.12, except for the Yule-Walker estimator we cannot rule out that
$\hat\rho(y)$ is larger than one in absolute value. For such values of $y$ the matrix $\Lambda(\hat\rho(y))$, although being
nonsingular, is indefinite. [To see this, note that $\det \Lambda(\hat\rho(y)) = (1 - \hat\rho^2(y))^{n-1}$, which is negative
for $|\hat\rho(y)| > 1$ if $n$ is even. Hence there must exist a negative and a positive eigenvalue. For odd
$n > 1$ the claim then follows from Cauchy's interlacing theorem.] In fact, if $|\hat\rho(y)| > 1$ occurs for
some $y$, then it occurs on a set of positive $\lambda_{\mathbb{R}^n}$-measure in view of continuity of $\hat\rho$. As a consequence,
$\tilde\Omega(y)$ and $\hat\Omega(y)$ are not guaranteed to be $\lambda_{\mathbb{R}^n}$-almost everywhere nonnegative definite (except if the
Yule-Walker estimator is being used), although they are $\lambda_{\mathbb{R}^n}$-almost everywhere nonsingular in case
$k \leq a_2 - a_1$. Of course, the probability of the event $|\hat\rho(y)| > 1$ will go to zero as sample size goes to
infinity, but this is not relevant for the present finite-sample analysis and the complications ensuing
from $|\hat\rho(y)| > 1$ have to be dealt with. Fortunately, the theory in Subsection 5.4 does not require
the estimated variance covariance matrices to be nonnegative definite almost everywhere but only
requires some weaker properties to be satisfied which are formalized in Assumptions 5 and 7 in
Subsection 5.4. Lemma B.3 in Appendix B shows that $\tilde\Omega$ and $\hat\Omega$ satisfy these assumptions.
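The determinant formula and the indefiniteness of $\Lambda(\rho)$ for $|\rho| > 1$ can be confirmed numerically. This is a small sketch with an arbitrarily chosen value $\rho = 1.2$; the helper name `ar1_corr` is ours, and $\Lambda(\rho)$ is again taken to have $(i,j)$-th entry $\rho^{|i-j|}$.

```python
import numpy as np

def ar1_corr(rho, n):
    """Matrix Lambda(rho) with (i, j) entry rho**|i - j|."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

rho = 1.2
results = {}
for n in (4, 5):                                   # one even and one odd sample size
    lam = ar1_corr(rho, n)
    det_numeric = np.linalg.det(lam)
    det_formula = (1.0 - rho ** 2) ** (n - 1)      # det Lambda(rho) = (1 - rho^2)^(n-1)
    eig = np.linalg.eigvalsh(lam)
    results[n] = (det_numeric, det_formula, eig.min(), eig.max())
    print(n, det_numeric, det_formula, eig.min(), eig.max())
```

For even $n$ the determinant is negative; for odd $n$ it is positive, yet the smallest eigenvalue is still negative, in line with the interlacing argument given in the text.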

The subsequent theorem provides a negative result that is similar in spirit to Theorem 3.3.

Theorem 3.14. Suppose Assumptions 1 and 4 are satisfied and $k \leq a_2 - a_1$ holds. Let $W_{FGLS}(C) =
\{y \in \mathbb{R}^n : T_{FGLS}(y) \geq C\}$ and $W_{OLS}(C) = \{y \in \mathbb{R}^n : T_{OLS}(y) \geq C\}$ be the rejection regions
corresponding to the test statistics $T_{FGLS}$ and $T_{OLS}$, respectively, where $C$ is a real number satisfying
$0 < C < \infty$. Then the following holds:

1. Suppose $e_+ \notin N_2^*(a_1, a_2)$ and $T_{FGLS}(e_+ + \mu_0^*) > C$ hold for some (and hence all) $\mu_0^* \in \mathfrak{M}_0$,
or $e_- \notin N_2^*(a_1, a_2)$ and $T_{FGLS}(e_- + \mu_0^*) > C$ hold for some (and hence all) $\mu_0^* \in \mathfrak{M}_0$. Then

$$\sup_{\Sigma \in \mathfrak{C}} P_{\mu_0, \sigma^2\Sigma}(W_{FGLS}(C)) = 1$$

holds for every $\mu_0 \in \mathfrak{M}_0$ and every $0 < \sigma^2 < \infty$. In particular, the size of the test is equal to
one.

2. Suppose $e_+ \notin N_2^*(a_1, a_2)$ and $T_{FGLS}(e_+ + \mu_0^*) < C$ hold for some (and hence all) $\mu_0^* \in \mathfrak{M}_0$,
or $e_- \notin N_2^*(a_1, a_2)$ and $T_{FGLS}(e_- + \mu_0^*) < C$ hold for some (and hence all) $\mu_0^* \in \mathfrak{M}_0$. Then

$$\inf_{\Sigma \in \mathfrak{C}} P_{\mu_0, \sigma^2\Sigma}(W_{FGLS}(C)) = 0$$

holds for every $\mu_0 \in \mathfrak{M}_0$ and every $0 < \sigma^2 < \infty$, and hence

$$\inf_{\mu_1 \in \mathfrak{M}_1} \inf_{\Sigma \in \mathfrak{C}} P_{\mu_1, \sigma^2\Sigma}(W_{FGLS}(C)) = 0$$

holds for every $0 < \sigma^2 < \infty$. In particular, the test is biased. Furthermore, the nuisance-infimal
rejection probability at every point $\mu_1 \in \mathfrak{M}_1$ is zero, i.e.,

$$\inf_{0 < \sigma^2 < \infty} \inf_{\Sigma \in \mathfrak{C}} P_{\mu_1, \sigma^2\Sigma}(W_{FGLS}(C)) = 0.$$

In particular, the infimal power of the test is equal to zero.

3. Suppose that $e_+ \in \mathfrak{M}$ and $R\hat\beta(e_+) \neq 0$ hold. Then there exists a constant $K_{FGLS}(e_+)$, which
depends only on $e_+$, $R$, $X$, and $\mathfrak{C}$, such that for every $\mu_0 \in \mathfrak{M}_0$, every $\sigma$ with $0 < \sigma < \infty$,
and every $M \geq 0$ we have

$$\inf_{\gamma \in \mathbb{R},\, |\gamma| \geq M}\ \inf_{\Sigma \in \mathfrak{C}} P_{\mu_0 + \gamma e_+, \sigma^2\Sigma}(W_{FGLS}(C)) \leq K_{FGLS}(e_+) \leq \sup_{\Sigma \in \mathfrak{C}} P_{\mu_0, \sigma^2\Sigma}(W_{FGLS}(C)).$$

Note that $\mu_0 + \gamma e_+ \in \mathfrak{M}_1$ for $\gamma \neq 0$. Furthermore, if $\hat\rho = \hat\rho_{YW}$, then $K_{FGLS}(e_+) = 1$ and
hence

$$\sup_{\Sigma \in \mathfrak{C}} P_{\mu_0, \sigma^2\Sigma}(W_{FGLS}(C)) = 1 \qquad (9)$$

holds for every $\mu_0 \in \mathfrak{M}_0$ and every $0 < \sigma^2 < \infty$. If $e_- \in \mathfrak{M}$ and $R\hat\beta(e_-) \neq 0$ hold, then
the analogous statements hold with $e_+$ replaced by $e_-$ where the constant $K_{FGLS}(e_-)$ now
depends only on $e_-$, $R$, $X$, and $\mathfrak{C}$.

4. Statements analogous to 1.-3. hold true if $T_{FGLS}$ is replaced by $T_{OLS}$, $W_{FGLS}(C)$ is replaced
by $W_{OLS}(C)$, the set $N_2^*(a_1, a_2)$ is replaced by $N_0^*(a_1, a_2)$, and the constants $K_{FGLS}(\cdot)$ are
replaced by constants $K_{OLS}(\cdot)$.

The meaning of Parts 1 and 2 of the preceding theorem is similar to the meaning of the corresponding
parts of Theorem 3.3. Part 3 differs somewhat from the corresponding part of the earlier
theorem, and tells us that, given the conditions in Part 3 are met, there exist points in the alternative,
arbitrarily far away from the null hypothesis, at which power is not larger than the size of
the test. The reason for the difference between Part 3 of Theorem 3.3 and Part 3 of the preceding
theorem lies in the fact that the variance covariance matrix estimators $\tilde\Omega$ and $\hat\Omega$ used in the present
subsection can be indefinite. This requires one in the proof of the preceding theorem to resort to
Theorem 5.19 rather than to using Part 3 of Corollary 5.17. In the case where the Yule-Walker
estimator $\hat\rho_{YW}$ is used, the matrices $\tilde\Omega$ and $\hat\Omega$ are always nonnegative definite, and Part 3 of
Corollary 5.17 as well as Theorem 5.19 deliver the same conclusion (9) in this case. Recall also that in
this case the exceptional null sets appearing in Parts 1 and 2 satisfy $N_2^*(1,n) = N_0^*(1,n) = \mathfrak{M}$. In
view of the general results in Subsection 5.4 there is little doubt that similar negative results can
also be obtained for FGLS or OLS based tests that are constructed on the basis of higher order
autoregressive models or of other more profligate parametric models (as long as $\mathfrak{C} \supseteq \mathfrak{C}_{AR(1)}$
is assumed). Hence it is to be expected that autocorrelation robust tests based on autoregressive
estimates (cf. Berk (1974), den Haan and Levin (1997), Sun and Kaplan (2012)) will also suffer
from severe size and power problems.

The results given in the preceding theorem reveal serious size and power problems of the tests
based on $T_{FGLS}$ and $T_{OLS}$. Note that these problems arise even if $\mathfrak{C} = \mathfrak{C}_{AR(1)}$, i.e., even if the
construction of the test statistics makes use of the correct covariance model. If $\mathfrak{C} = \mathfrak{C}_{AR(1)}$ holds, it
is interesting to contrast the above results with the size and power properties of the corresponding
infeasible tests, whose test statistics are defined in a similar way as $T_{FGLS}$ and $T_{OLS}$
are, but with $\hat\rho$ replaced by the true value of $\rho$: These tests are standard F-tests (except for not
being standardized by $q$), have well-known and reasonable size and power properties, and do not
suffer from the size and power problems exhibited by their feasible counterparts.

Similar to the situation in Subsection 3.1, the conditions in Parts 1-3 of the preceding theorem
only depend on $a_1$ and $a_2$ (i.e., on the choice of estimator $\hat\rho$), the design matrix $X$, the restriction
$(R, r)$, the vector $e_+$ ($e_-$, respectively), and the critical value $C$. Hence, in any particular application
it can be decided whether or not (and which of) these conditions are satisfied. We furthermore note
that a remark analogous to Remark 3.4 also applies mutatis mutandis to the preceding theorem.
We also note that a result analogous to Theorem 3.11 could be given here, but we do not spell out
the details.

We next show that the conditions of Theorem 3.14 involving the design matrix $X$ are generically
satisfied. The first part of the subsequent proposition shows that these conditions are generically
satisfied in the class of all possible design matrices of rank $k$. Parts 2 and 3 show a corresponding
result if we impose that the regression model has to contain an intercept. In the proposition the
dependence of several quantities like $T_{FGLS}$, $T_{OLS}$, $N_2^*(a_1, a_2)$, etc. on the design matrix $X$ will be
important and thus we shall write $T_{FGLS,X}$, $T_{OLS,X}$, $N_{2,X}^*(a_1, a_2)$, etc. for these quantities in the
result to follow.

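Proposition 3.15. Suppose Assumption 1 holds. Fix $(R, r)$ with $\mathrm{rank}(R) = q$, fix $0 < C < \infty$,
and fix $a_1 \in \{1, 2\}$ and $a_2 \in \{n-1, n\}$ in Assumption 4. Suppose $k \leq a_2 - a_1$ holds. Let $T_{FGLS,X}$
and $T_{OLS,X}$ be the test statistics defined above and let $\mu_0^* \in \mathfrak{M}_0$ be arbitrary.

1. With $\mathfrak{X}_0$ defined in Proposition 3.5, define now

$$\mathfrak{X}_{1,FGLS}(e_+) = \left\{ X \in \mathfrak{X}_0 : e_+ \in N_{2,X}^*(a_1, a_2) \right\},$$

$$\mathfrak{X}_{2,FGLS}(e_+) = \left\{ X \in \mathfrak{X}_0 \setminus \mathfrak{X}_{1,FGLS}(e_+) : T_{FGLS,X}(e_+ + \mu_0^*) = C \right\},$$

and similarly define $\mathfrak{X}_{1,FGLS}(e_-)$, $\mathfrak{X}_{2,FGLS}(e_-)$. [Note that $\mathfrak{X}_{2,FGLS}(e_+)$ and $\mathfrak{X}_{2,FGLS}(e_-)$
do not depend on the choice of $\mu_0^*$.] Then $\mathfrak{X}_{1,FGLS}(e_+)$ and $\mathfrak{X}_{1,FGLS}(e_-)$ are $\lambda_{\mathbb{R}^{n \times k}}$-null sets.
The same is true for $\mathfrak{X}_{2,FGLS}(e_+)$ ($\mathfrak{X}_{2,FGLS}(e_-)$, respectively) under the provision that it is
a proper subset of $\mathfrak{X}_0 \setminus \mathfrak{X}_{1,FGLS}(e_+)$ ($\mathfrak{X}_0 \setminus \mathfrak{X}_{1,FGLS}(e_-)$, respectively). The set of all design
matrices $X \in \mathfrak{X}_0$ for which Theorem 3.14 does not apply is a subset of

$$\left( \mathfrak{X}_{1,FGLS}(e_+) \cup \mathfrak{X}_{2,FGLS}(e_+) \right) \cap \left( \mathfrak{X}_{1,FGLS}(e_-) \cup \mathfrak{X}_{2,FGLS}(e_-) \right).$$

Hence it is a $\lambda_{\mathbb{R}^{n \times k}}$-null set provided the preceding provision holds for at least one of $\mathfrak{X}_{2,FGLS}(e_+)$
or $\mathfrak{X}_{2,FGLS}(e_-)$; it thus is a "negligible" subset of $\mathfrak{X}_0$ in view of the fact that $\mathfrak{X}_0$ differs
from $\mathbb{R}^{n \times k}$ only by a $\lambda_{\mathbb{R}^{n \times k}}$-null set.

2. Suppose $k \geq 2$ and $n \geq 4$ hold and suppose $X$ has $e_+$ as its first column, i.e., $X = (e_+, \tilde X)$.
With $\tilde{\mathfrak{X}}_0$ defined in Proposition 3.5 define

$$\tilde{\mathfrak{X}}_{1,FGLS}(e_-) = \left\{ \tilde X \in \tilde{\mathfrak{X}}_0 : e_- \in N_{2,(e_+,\tilde X)}^*(a_1, a_2) \right\},$$

$$\tilde{\mathfrak{X}}_{2,FGLS}(e_-) = \left\{ \tilde X \in \tilde{\mathfrak{X}}_0 \setminus \tilde{\mathfrak{X}}_{1,FGLS}(e_-) : T_{FGLS,(e_+,\tilde X)}(e_- + \mu_0^*) = C \right\},$$

and note that $\tilde{\mathfrak{X}}_{2,FGLS}(e_-)$ does not depend on the choice of $\mu_0^*$. Then $\tilde{\mathfrak{X}}_{1,FGLS}(e_-)$ is
a $\lambda_{\mathbb{R}^{n \times (k-1)}}$-null set. The set $\tilde{\mathfrak{X}}_{2,FGLS}(e_-)$ is a $\lambda_{\mathbb{R}^{n \times (k-1)}}$-null set under the provision that
it is a proper subset of $\tilde{\mathfrak{X}}_0 \setminus \tilde{\mathfrak{X}}_{1,FGLS}(e_-)$. [The analogously defined sets $\tilde{\mathfrak{X}}_{1,FGLS}(e_+)$ and
$\tilde{\mathfrak{X}}_{2,FGLS}(e_+)$ satisfy $\tilde{\mathfrak{X}}_{1,FGLS}(e_+) = \tilde{\mathfrak{X}}_0$ and $\tilde{\mathfrak{X}}_{2,FGLS}(e_+) = \emptyset$.] The set of all matrices
$\tilde X \in \tilde{\mathfrak{X}}_0$ such that Theorem 3.14 does not apply to the design matrix $X = (e_+, \tilde X)$ is a subset
of $\tilde{\mathfrak{X}}_{1,FGLS}(e_-) \cup \tilde{\mathfrak{X}}_{2,FGLS}(e_-)$ and hence is a $\lambda_{\mathbb{R}^{n \times (k-1)}}$-null set under the preceding provision;
it thus is a "negligible" subset of $\tilde{\mathfrak{X}}_0$ in view of the fact that $\tilde{\mathfrak{X}}_0$ differs from $\mathbb{R}^{n \times (k-1)}$ only by
a $\lambda_{\mathbb{R}^{n \times (k-1)}}$-null set.

3. Define $\mathfrak{X}_{1,OLS}(\cdot)$ and $\mathfrak{X}_{2,OLS}(\cdot)$ analogously, but with $N_{0,X}^*(a_1, a_2)$ replacing $N_{2,X}^*(a_1, a_2)$
and $T_{OLS,X}$ replacing $T_{FGLS,X}$. Similarly define $\tilde{\mathfrak{X}}_{1,OLS}(\cdot)$ and $\tilde{\mathfrak{X}}_{2,OLS}(\cdot)$. Then Part 1 (Part
2, respectively) holds analogously for $\mathfrak{X}_{1,OLS}(\cdot)$ and $\mathfrak{X}_{2,OLS}(\cdot)$ ($\tilde{\mathfrak{X}}_{1,OLS}(\cdot)$ and $\tilde{\mathfrak{X}}_{2,OLS}(\cdot)$,
respectively) with obvious changes.

4. Suppose $X = (e_+, \tilde X)$, and suppose the first column of $R$ is nonzero. Then Part 3 of Theorem
3.14 applies to the design matrix $X = (e_+, \tilde X)$ for every $\tilde X \in \tilde{\mathfrak{X}}_0$ (for the FGLS- as well as
for the OLS-based test).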

The preceding genericity result maintains in Part 1 the provision that $\mathfrak{X}_{2,FGLS}(e_+)$ is a proper
subset of $\mathfrak{X}_0 \setminus \mathfrak{X}_{1,FGLS}(e_+)$ or that $\mathfrak{X}_{2,FGLS}(e_-)$ is a proper subset of $\mathfrak{X}_0 \setminus \mathfrak{X}_{1,FGLS}(e_-)$. Note that
the provision depends on the critical value $C$. If the provision is satisfied for the given $C$, we
can conclude from Part 1 that the set of all design matrices $X \in \mathfrak{X}_0$ for which Theorem 3.14 is
not applicable to the test statistic $T_{FGLS}$ is "negligible". If the provision is not satisfied, i.e., if
$\mathfrak{X}_{2,FGLS}(e_+) = \mathfrak{X}_0 \setminus \mathfrak{X}_{1,FGLS}(e_+)$ and $\mathfrak{X}_{2,FGLS}(e_-) = \mathfrak{X}_0 \setminus \mathfrak{X}_{1,FGLS}(e_-)$ hold, and thus we cannot
draw the desired conclusion for the given value of $C$, we immediately see that the provision must
then be satisfied for any other choice $C'$ of the critical value; hence, negligibility of the set of
design matrices for which Theorem 3.14 is not applicable to the test statistic $T_{FGLS}$ can then be
concluded for any $C' \neq C$. Summarizing, we see that the provision is always satisfied except possibly
for one particular choice of the critical value. A similar comment applies to Parts 2 and 3 of the
proposition.$^{11}$

As in Subsection 3.1, we next discuss an exceptional case to which Theorem 3.14 does
not apply and which allows for a positive result, at least if the covariance model $\mathfrak{C}$ is assumed to
be $\mathfrak{C}_{AR(1)}$ or is approximated by $\mathfrak{C}_{AR(1)}$ near the singular points (in the sense of Remark 3.9).

Theorem 3.16. Suppose $\mathfrak{C} = \mathfrak{C}_{AR(1)}$, Assumption 4 is satisfied, and $k \leq a_2 - a_1$ holds. Let
$W_{FGLS}(C) = \{y \in \mathbb{R}^n : T_{FGLS}(y) \geq C\}$ and $W_{OLS}(C) = \{y \in \mathbb{R}^n : T_{OLS}(y) \geq C\}$ be the rejection
regions corresponding to the test statistics $T_{FGLS}$ and $T_{OLS}$, respectively, where $C$ is a real number
satisfying $0 < C < \infty$. If $e_+, e_- \in \mathfrak{M}$ and $R\hat\beta(e_+) = R\hat\beta(e_-) = 0$ is satisfied, then the following
holds for $W(C) = W_{FGLS}(C)$ as well as $W(C) = W_{OLS}(C)$:

1. The size of the rejection region $W(C)$ is strictly less than 1, i.e.,

$$\sup_{\mu_0 \in \mathfrak{M}_0}\ \sup_{0 < \sigma^2 < \infty}\ \sup_{-1 < \rho < 1} P_{\mu_0, \sigma^2\Lambda(\rho)}(W(C)) < 1.$$

Furthermore,

$$\inf_{\mu_0 \in \mathfrak{M}_0}\ \inf_{0 < \sigma^2 < \infty}\ \inf_{-1 < \rho < 1} P_{\mu_0, \sigma^2\Lambda(\rho)}(W(C)) > 0.$$

2. The infimal power is bounded away from zero, i.e.,

$$\inf_{\mu_1 \in \mathfrak{M}_1}\ \inf_{0 < \sigma^2 < \infty}\ \inf_{-1 < \rho < 1} P_{\mu_1, \sigma^2\Lambda(\rho)}(W(C)) > 0.$$

3. Suppose that $a_1 = 1$ and $a_2 = n$. Then for every $0 < c < \infty$

$$\inf_{\substack{\mu_1 \in \mathfrak{M}_1,\ 0 < \sigma^2 < \infty \\ d(\mu_1, \mathfrak{M}_0)/\sigma \geq c}} P_{\mu_1, \sigma^2\Lambda(\rho_m)}(W(C)) \to 1$$

holds for $m \to \infty$ and for any sequence $\rho_m \in (-1,1)$ satisfying $|\rho_m| \to 1$. Furthermore, for
every sequence $0 < c_m < \infty$ and every $0 < \varepsilon < 1$

$$\inf_{\substack{\mu_1 \in \mathfrak{M}_1,\ -1+\varepsilon \leq \rho \leq 1-\varepsilon \\ d(\mu_1, \mathfrak{M}_0) \geq c_m}} P_{\mu_1, \sigma_m^2\Lambda(\rho)}(W(C)) \to 1$$

holds for $m \to \infty$ whenever $0 < \sigma_m^2 < \infty$ and $c_m/\sigma_m \to \infty$. [The very last statement holds
even without the conditions $e_+, e_- \in \mathfrak{M}$ and $R\hat\beta(e_+) = R\hat\beta(e_-) = 0$.]

4. For every $\delta$, $0 < \delta < 1$, there exists a $C(\delta)$, $0 < C(\delta) < \infty$, such that

$$\sup_{\mu_0 \in \mathfrak{M}_0}\ \sup_{0 < \sigma^2 < \infty}\ \sup_{-1 < \rho < 1} P_{\mu_0, \sigma^2\Lambda(\rho)}(W(C(\delta))) \leq \delta.$$

11 For example, if $T_{OLS}$ is used, $a_1 = 1$, $a_2 = n$ (Yule-Walker estimator), and $X$ is not restricted to be of the form
$(e_+, \tilde X)$, it is not difficult to show that the provision is in fact satisfied for every choice of $C$. This can also be shown
for other choices of $a_1$ and $a_2$ and/or for the case where $X = (e_+, \tilde X)$ under additional assumptions on $R$. It may
actually be true in general, but we do not want to pursue this.

A discussion similar to the one following Theorem 3.6 applies also here. Furthermore, a result
paralleling Theorem 3.7 can again be obtained by a combined application of Theorem 5.21 and
Proposition 5.23. The so-obtained result shows how adjusted versions of the test statistics $T_{FGLS}$ and $T_{OLS}$ can
be constructed that have size/power properties as given in the preceding theorem also in many
cases which fall under the wrath of Theorem 3.14 (and for which the tests based on $T_{FGLS}$ and
$T_{OLS}$ suffer from extreme size or power deficiencies). The adjustment mechanism again amounts to
using a "working model" that always adds the regressors $e_+$ and/or $e_-$ to the design matrix. We
abstain from providing details.

3.3 Some remarks on the F-test without correction for autocorrelation 

As mentioned in the introduction, a considerable body of literature is concerned with the properties
of the standard F-test (i.e., the F-test without correction for autocorrelation) in the presence of
autocorrelation. Much of this literature concentrates on the case where the errors follow a stationary
autoregressive process of order 1, i.e., $\mathfrak{C} = \mathfrak{C}_{AR(1)}$. As the correlation in the errors is not accounted
for in the standard F-test, bad performance of the standard F-test for large values of the correlation
$\rho$ can be expected. This has been demonstrated formally in Krämer (1989), Krämer et al. (1990),
and subsequently in Banerjee and Magnus (2000): These papers determine the limit as $\rho \to 1$ of
the error of the first kind of the standard F-test and show that (i) this limit is 1 if the regression
contains an intercept and the restrictions to be tested involve the intercept (i.e., the $n \times 1$ vector
$e_+ = (1, \ldots, 1)'$ belongs to the span of the design matrix and $R\hat\beta(e_+) \neq 0$ holds) or if the regression
does not contain an intercept (i.e., $e_+$ does not belong to the span of the design matrix) and a certain
observable quantity, $A$ say, is positive, (ii) it is 0 if the regression does not contain an intercept
and the observable quantity $A$ is negative, and (iii) it is a value between 0 and 1 if the regression
contains an intercept but the restrictions to be tested do not involve the intercept (i.e., $e_+$ belongs
to the span of the design matrix and $R\hat\beta(e_+) = 0$ holds).$^{12}$ It perhaps comes as a surprise that
autocorrelation robust tests, which have built into them a correction for autocorrelation, exhibit a
similar behavior as shown in Section 3 of the present paper. We mention that, due to the relatively
simple structure of the standard F-test statistic as a ratio of quadratic forms, the method of proof in
Krämer (1989), Krämer et al. (1990), and Banerjee and Magnus (2000) is by direct computation
of the limit (as $\rho \to 1$) of the test statistic. In contrast, the results for the much more complicated
test statistics considered in the present paper rely on quite different methods which make use of
invariance considerations and are of a more geometric flavor. Needless to say, the just mentioned
results in Krämer (1989), Krämer et al. (1990), and Banerjee and Magnus (2000) can be rederived
through a straightforward application of the general results in Subsection 5.4 to the standard F-test.
In light of the fact that the standard F-test makes no correction for autocorrelation at all,
a perhaps surprising observation is that nevertheless an analogue to Theorems 3.6 and 3.16 can
be established for the standard F-test by a simple application of Theorem 5.21. Even more, the
adjustment procedure described in Proposition 5.23 can be applied to the standard F-test leading
to a result analogous to Theorem 3.7. While these results show that the size and power of the
so-adjusted standard F-test do not "break down" completely for extreme correlations, they do not
tell us much about the performance of the adjusted test for moderate correlations.
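Case (i) of the limit results for the uncorrected F-test discussed in this subsection can be illustrated by a small Monte Carlo experiment. The sketch below is our own illustration, not a computation from the cited papers: it uses the Gaussian location model (the design matrix consists of the intercept $e_+$ only, and the null hypothesis restricts the intercept to zero), stationary AR(1) errors with unit variance, and the hard-coded critical value 4.381, which is approximately the 0.95-quantile of the $F_{1,19}$ distribution for $n = 20$.

```python
import numpy as np

def rejection_rate(rho, n=20, reps=4000, crit=4.381, seed=0):
    """Simulated null rejection frequency of the uncorrected F-test for a zero mean."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal((reps, n))
    u = np.empty((reps, n))
    u[:, 0] = eps[:, 0]                                   # stationary start, variance 1
    for t in range(1, n):
        u[:, t] = rho * u[:, t - 1] + np.sqrt(1.0 - rho ** 2) * eps[:, t]
    mean = u.mean(axis=1)
    s2 = u.var(axis=1, ddof=1)                            # usual variance estimate, no correction
    F = n * mean ** 2 / s2                                # squared t-statistic
    return float(np.mean(F >= crit))

rate_iid = rejection_rate(0.0)
rate_high = rejection_rate(0.999)
print("rho = 0.000:", rate_iid)
print("rho = 0.999:", rate_high)
```

For $\rho = 0$ the simulated rejection frequency is close to the nominal 5 percent, whereas for $\rho$ close to 1 it approaches 1, in line with case (i).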



4 Size and Power of Tests of Linear Restrictions in Regression Models with Heteroskedastic Disturbances

We next turn to size and power properties of commonly used heteroskedasticity robust tests. To
this end we allow for heteroskedasticity of unknown form as is common in the literature and thus
allow that the errors in the regression model have a variance covariance matrix $\sigma^2\Sigma$ where $\Sigma$ is an
element of the covariance model given by

$$\mathfrak{C}_{Het} = \left\{ \mathrm{diag}(\tau_1^2, \ldots, \tau_n^2) : \tau_i^2 > 0,\ i = 1, \ldots, n,\ \sum_{i=1}^{n} \tau_i^2 = 1 \right\}.$$

The normalization for $\Sigma$ chosen is of course arbitrary and could equally well be replaced, e.g., by
the normalization $\tau_1^2 = 1$. The heteroskedasticity robust test statistic considered is given by

$$T_{Het}(y) = \begin{cases} (R\hat\beta(y) - r)'\,\hat\Omega_{Het}^{-1}(y)\,(R\hat\beta(y) - r) & \text{if } \det \hat\Omega_{Het}(y) \neq 0, \\ 0 & \text{if } \det \hat\Omega_{Het}(y) = 0, \end{cases} \qquad (10)$$

where $\hat\Omega_{Het} = R\hat\Psi_{Het}R'$ and $\hat\Psi_{Het}$ is a heteroskedasticity robust estimator. Such estimators were
introduced in Eicker (1963, 1967) and have later found their way into the econometrics literature
(e.g., White (1980)). They are of the form

$$\hat\Psi_{Het}(y) = (X'X)^{-1}X'\,\mathrm{diag}\!\left(d_1\hat u_1^2(y), \ldots, d_n\hat u_n^2(y)\right) X(X'X)^{-1}$$

where the constants $d_i > 0$ may depend on the design matrix. Typical choices for $d_i$ are $d_i = 1$, $d_i =
n/(n-k)$, $d_i = (1 - h_{ii})^{-1}$, or $d_i = (1 - h_{ii})^{-2}$ where $h_{ii}$ denotes the $i$-th diagonal element of the



12 Banerjee and Magnus (2000) claim in their Theorem 5 that the expression Pr (-F(O) > <5) converges to zero if
Mi j^ and -F(O) < 8. In case F(0) = <5 the argument given there is, however, incorrect, because -F(O) — > F(0) = 8 
in probability does not imply Pr (-F(O) > 8) — > in general.

projection matrix $X(X'X)^{-1}X'$, see Long and Ervin (2000) for an overview. Another suggestion
is $d_i = (1 - h_{ii})^{-\delta_i}$ for suitable choice of $\delta_i$, see Cribari-Neto (2004). For the last three choices of
$d_i$ we use the convention that we set $d_i = 1$ in case $h_{ii} = 1$. Note that $h_{ii} = 1$ implies $\hat u_i(y) = 0$ for
every $y$, and hence it is irrelevant which real value is assigned to $d_i$ in case $h_{ii} = 1$.
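The statistic $T_{Het}$ with the first four choices of the weights $d_i$ can be sketched as follows. The function name, the labels "HC0" to "HC3" (the names under which these weights are commonly known in the literature), and the simulated data are our own illustrative choices; the exact-zero determinant test is only a crude numerical stand-in for the case distinction in (10).

```python
import numpy as np

def het_robust_statistic(y, X, R, r, weights="HC0"):
    """T_Het(y) for d_i = 1 (HC0), n/(n-k) (HC1), (1-h_ii)^-1 (HC2), (1-h_ii)^-2 (HC3)."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    u = y - X @ beta
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)      # diagonal of X (X'X)^{-1} X'
    if weights == "HC0":
        d = np.ones(n)
    elif weights == "HC1":
        d = np.full(n, n / (n - k))
    elif weights in ("HC2", "HC3"):
        power = 1.0 if weights == "HC2" else 2.0
        regular = ~np.isclose(h, 1.0)
        d = np.ones(n)                               # convention: d_i = 1 if h_ii = 1
        d[regular] = (1.0 - h[regular]) ** (-power)
    else:
        raise ValueError("unknown weights: " + str(weights))
    psi = XtX_inv @ X.T @ np.diag(d * u ** 2) @ X @ XtX_inv
    omega = R @ psi @ R.T
    if np.linalg.det(omega) == 0.0:
        return 0.0
    diff = R @ beta - r
    return float(diff @ np.linalg.solve(omega, diff))

rng = np.random.default_rng(2)
n, k = 40, 3
X = np.column_stack([np.ones(n), rng.standard_normal((n, k - 1))])
y = X @ np.array([1.0, 0.0, 0.0]) + rng.standard_normal(n) * (1.0 + np.abs(X[:, 1]))
R = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
r = np.zeros(2)
stats = {w: het_robust_statistic(y, X, R, r, w) for w in ("HC0", "HC1", "HC2", "HC3")}
print(stats)
```

Since the second choice only rescales $\hat\Psi_{Het}$ by $n/(n-k)$, the corresponding statistic equals the first one multiplied by $(n-k)/n$, which gives a simple consistency check.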

Similarly to Subsection 3.1, we need to ensure that $\hat\Omega_{Het}(y)$ is nonsingular $\lambda_{\mathbb{R}^n}$-almost everywhere. As
shown in the subsequent lemma this is the case provided Assumption 3 introduced in Subsection
3.1 is satisfied. The lemma also shows that in case this assumption is violated the matrix $\hat\Omega_{Het}(y)$
is singular everywhere, leading to a complete and trivial breakdown of the test. Recall the definition
of the matrix $B(y)$ given in (8) and note that it is independent of the constants $d_i$.

Lemma 4.1. 1. $\hat\Omega_{Het}(y)$ is nonnegative definite for every $y \in \mathbb{R}^n$.

2. $\hat\Omega_{Het}(y)$ is singular if and only if $\operatorname{rank}(B(y)) < q$.

3. $\hat\Omega_{Het}(y) = 0$ if and only if $B(y) = 0$.

4. The set of all $y \in \mathbb{R}^n$ for which $\hat\Omega_{Het}(y)$ is singular (or, equivalently, for which $\operatorname{rank}(B(y)) < q$)
is either a $\lambda_{\mathbb{R}^n}$-null set or the entire sample space $\mathbb{R}^n$. The latter occurs if and only if
Assumption 3 is violated.



The proof of the preceding lemma is completely analogous to the proof of Lemma 3.1 and hence
is omitted. We are now in the position to state the result on size and power of tests based on the
statistic $T_{Het}$ given in (10).



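Theorem 4.2. Suppose $\mathfrak{C} \supseteq \mathfrak{C}_{Het}$ holds and Assumption 3 is satisfied. Let $T_{Het}$ be the test statistic
defined in (10) and let $W_{Het}(C) = \{y \in \mathbb{R}^n : T_{Het}(y) \geq C\}$ be the rejection region where $C$ is a real
number satisfying $0 < C < \infty$. Then the following holds:

1. Suppose for some $i$, $1 \leq i \leq n$, we have $\operatorname{rank}(B(e_i(n))) = q$ and $T_{Het}(e_i(n) + \mu_0^*) > C$ for
some (and hence all) $\mu_0^* \in \mathfrak{M}_0$. Then

$$\sup_{\Sigma \in \mathfrak{C}} P_{\mu_0, \sigma^2\Sigma}(W_{Het}(C)) = 1$$

holds for every $\mu_0 \in \mathfrak{M}_0$ and every $0 < \sigma^2 < \infty$. In particular, the size of the test is equal to
one.

2. Suppose for some $i$, $1 \leq i \leq n$, we have $\operatorname{rank}(B(e_i(n))) = q$ and $T_{Het}(e_i(n) + \mu_0^*) < C$ for
some (and hence all) $\mu_0^* \in \mathfrak{M}_0$. Then

$$\inf_{\Sigma \in \mathfrak{C}} P_{\mu_0, \sigma^2\Sigma}(W_{Het}(C)) = 0$$

holds for every $\mu_0 \in \mathfrak{M}_0$ and every $0 < \sigma^2 < \infty$, and hence

$$\inf_{\mu_1 \in \mathfrak{M}_1} \inf_{\Sigma \in \mathfrak{C}} P_{\mu_1, \sigma^2\Sigma}(W_{Het}(C)) = 0$$

holds for every $0 < \sigma^2 < \infty$. In particular, the test is biased. Furthermore, the nuisance-infimal
rejection probability at every point $\mu_1 \in \mathfrak{M}_1$ is zero, i.e.,

$$\inf_{0 < \sigma^2 < \infty} \inf_{\Sigma \in \mathfrak{C}} P_{\mu_1, \sigma^2\Sigma}(W_{Het}(C)) = 0.$$

In particular, the infimal power of the test is equal to zero.

3. Suppose for some $i$, $1 \leq i \leq n$, we have $B(e_i(n)) = 0$ and $R\hat\beta(e_i(n)) \neq 0$. Then

$$\sup_{\Sigma \in \mathfrak{C}} P_{\mu_0, \sigma^2\Sigma}(W_{Het}(C)) = 1$$

holds for every $\mu_0 \in \mathfrak{M}_0$ and every $0 < \sigma^2 < \infty$. In particular, the size of the test is equal to
one.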
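Part 2 of Theorem 4.2 can be illustrated by a small simulation. The sketch below rests on several assumptions that are not taken from the text: a design with an intercept and a linear trend, the null hypothesis of a zero slope (so that $r = 0$ and $\mu_0^* = 0$ lies in $\mathfrak{M}_0$), weights $d_i = 1$, and the illustrative critical value $C = 3.84$. Nonsingularity of $\hat\Omega_{Het}(e_i(n))$ is used in place of the rank condition on $B(e_i(n))$, which is equivalent by Lemma 4.1. Since the statistic is invariant under rescaling of $y$ when $r = 0$, the covariance matrices are not normalized.

```python
import numpy as np

def t_het_hc0(X, y, R, r):
    """Statistic (10) with weights d_i = 1; returns 0 if Omega_Het(y) is singular."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    u = y - X @ beta
    psi = XtX_inv @ (X.T @ (X * (u**2)[:, None])) @ XtX_inv
    omega = R @ psi @ R.T
    if np.linalg.matrix_rank(omega) < omega.shape[0]:
        return 0.0
    diff = R @ beta - r
    return float(diff @ np.linalg.solve(omega, diff))

n = 10
X = np.column_stack([np.ones(n), np.arange(n, dtype=float)])  # assumed design
R = np.array([[0.0, 1.0]])     # H0: slope equals zero
r = np.array([0.0])
C = 3.84                       # illustrative critical value (assumption)
i = n - 1                      # coordinate on which the variance concentrates

e_i = np.zeros(n)
e_i[i] = 1.0
t_at_e_i = t_het_hc0(X, e_i, R, r)   # T_Het(e_i(n) + mu_0^*) with mu_0^* = 0

def rejection_frequency(eps, reps=2000, seed=0):
    """Null rejection frequency when all standard deviations except the i-th equal eps."""
    rng = np.random.default_rng(seed)
    tau = np.full(n, eps)
    tau[i] = 1.0
    hits = 0
    for _ in range(reps):
        y = tau * rng.standard_normal(n)     # mean 0, which lies in M_0
        hits += t_het_hc0(X, y, R, r) > C
    return hits / reps

freq_homoskedastic = rejection_frequency(1.0)
freq_concentrated = rejection_frequency(1e-6)
print(t_at_e_i, freq_homoskedastic, freq_concentrated)
```

In this design `t_at_e_i` lies below `C`, so the case of Part 2 is the relevant one, and the simulated null rejection frequency under the nearly singular covariance matrix collapses towards zero.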

We note that Remark 3.4(i)-(iii) as well as most of the discussion following Theorem 3.3 apply
mutatis mutandis also here. Similarly to Subsection 3.1 it is also not difficult to show (for typical
choices of $d_i$) that the set of design matrices $X$ for which the conditions in Theorem 4.2 are not
satisfied is a negligible set. We omit a formal statement. In contrast to the case considered in
Subsection 3.1, however, no (nontrivial) analogues to the positive results given in Theorems 3.6 and
3.7 are possible due to the fact that in the present setting there are now too many concentration
spaces (which together in fact span all of $\mathbb{R}^n$). Furthermore, the above theorem and its proof exploit
only the one-dimensional concentration spaces $Z_i = \operatorname{span}(e_i(n))$. While every linear space of the
form $\operatorname{span}(e_{i_1}(n), \ldots, e_{i_p}(n))$ for $0 < p < n$ and $1 \leq i_1 < \ldots < i_p \leq n$ is a concentration space
of the model $\mathfrak{C}$, using all these concentration spaces in conjunction with Corollary 5.17 does not
deliver additional obstructions to good size or power properties, the reason being that each of these
spaces already contains a concentration space $Z_i$ as a subset. As a further point of interest we note
that the assumptions imposed in Eicker (1963, 1967) require all variances $\sigma^2\tau_i^2$ to be bounded away
from zero in order to achieve uniformity in the convergence to the limiting distribution. Hence,
Eicker's assumptions rule out the concentration effect that drives the above result. It appears
that this insight in Eicker (1963, 1967) has not been fully appreciated in the ensuing econometrics
literature. In connection with the preceding theorem, which points out size distortions and/or
power deficiencies of heteroskedasticity robust tests even under a normality assumption, a result in
Section 4.2 of Dufour (2003) needs to be mentioned which shows that the size of heteroskedasticity
robust tests is always 1 if one allows for a sufficiently large nonparametric class of distributions for
the errors $\mathbf{U}$.

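We briefly discuss the standard F-test statistic without any correction for heteroskedasticity.
Let

$$T_{uncorr}(y) = \begin{cases} \dfrac{n-k}{q}\, \dfrac{(R\hat\beta(y) - r)' \left(R(X'X)^{-1}R'\right)^{-1} (R\hat\beta(y) - r)}{\hat u'(y)\hat u(y)} & \text{if } y \notin \mathfrak{M}, \\ 0 & \text{if } y \in \mathfrak{M}, \end{cases}$$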



and define $W_{uncorr}(C)$ in the obvious way. It is then easy to see that a variant of Theorem 4.2
also holds with $T_{uncorr}$ and $W_{uncorr}(C)$ replacing $T_{Het}$ and $W_{Het}(C)$, respectively, if in this variant
of the theorem Assumption 3 is dropped, the condition $\operatorname{rank}(B(e_i(n))) = q$ is replaced by the
condition $e_i(n) \notin \mathfrak{M}$, and the condition $B(e_i(n)) = 0$ is replaced by the condition $e_i(n) \in \mathfrak{M}$. In
a recent paper Ibragimov and Müller (2010) consider the standard t-test for testing $\mu = 0$ versus
$\mu \neq 0$ in a Gaussian location model and discuss a result by Bakirov and Székely (2005) to the effect
that the size of this test under heteroskedasticity of unknown form equals the nominal significance
level $\delta$ as long as $n \geq 2$ and $\delta \leq 0.08326$. It is not difficult to see that in this location problem
$T_{uncorr}(e_i(n)) = 1$ holds for every $i$ (note that $\mu_0 = 0$) and thus the inequality $T_{uncorr}(e_i(n)) < C$
always holds whenever $C > 1$. Hence Case 1 of the variant of Theorem 4.2 just discussed does not
arise whenever $C > 1$, which is in line with the results in Bakirov and Székely (2005). However, note
that Case 2 of that theorem then always applies (since obviously $e_i(n) \notin \mathfrak{M} = \operatorname{span}(e_+)$), showing
that the standard t-test suffers from severe power deficiencies under heteroskedasticity of unknown
form in case $n \geq 2$ and $\delta \leq 0.08326$ (noting that the squared standard t-statistic is the standard
F-statistic).
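The claim that $T_{uncorr}(e_i(n)) = 1$ in the location model can be checked numerically. The sketch below assumes the location model is encoded as $X = e_+$ (a single column of ones), $R = 1$ and $r = 0$; the function name `t_uncorr` and the sample size $n = 7$ are choices made here for illustration, and the test on the residual sum of squares is a numerical stand-in for the condition $y \in \mathfrak{M}$.

```python
import numpy as np

def t_uncorr(X, y, R, r):
    """Uncorrected F-type statistic; returns 0 if y lies (numerically) in span(X)."""
    n, k = X.shape
    q = R.shape[0]
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    u = y - X @ beta
    rss = float(u @ u)
    if np.isclose(rss, 0.0):
        return 0.0
    diff = R @ beta - r
    middle = np.linalg.inv(R @ XtX_inv @ R.T)
    return (n - k) / q * float(diff @ middle @ diff) / rss

n = 7
X = np.ones((n, 1))            # location model: mean mu times the vector e_+
R = np.array([[1.0]])
r = np.array([0.0])            # H0: mu = 0
values = [t_uncorr(X, np.eye(n)[i], R, r) for i in range(n)]
print(values)
```

By hand, $\hat\beta(e_i(n)) = 1/n$, $\hat u'\hat u = (n-1)/n$ and $(R(X'X)^{-1}R')^{-1} = n$, so the statistic equals $(n-1) \cdot n^{-2} \cdot n \cdot n/(n-1) = 1$ for every $i$.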

5 General Principles Underlying Size and Power Results for 
Tests of Linear Restrictions in Regression Models with 
Nonspherical Disturbances 

The results on size and power properties given in the previous sections are obtained as special cases
of a more general theory that applies to a large class of tests and to general covariance models $\mathfrak{C}$
(which thus are not restricted to covariance structures resulting from stationary disturbances or
from heteroskedasticity). This theory is provided in the present section. We use the notation and
assumptions of Section 2. Since invariance properties of tests will play an important role in some
of the results to follow, the next subsection collects some relevant results related to invariance.
In Subsection 5.2 we provide conditions under which the tests considered have highly unpleasant
size or power properties. This result is based on a "concentration" effect. In contrast, Subsection
5.3 provides conditions under which tests do not suffer from the size and power problems just
mentioned. Subsection 5.4 then specializes the results of the preceding subsections to a class of
tests which can be described as nonsphericity-corrected F-type tests. This class of tests contains
virtually all so-called heteroskedasticity and autocorrelation robust tests available in the literature
as special cases. Furthermore, Subsection 5.4 also contains another negative result, the derivation
of which exploits the particular structure of these tests.

5.1 Some preliminaries on groups and invariance 

Let $G$ be a group of Borel-measurable transformations of $\mathbb{R}^n$ into itself, the group operation being
the composition of transformations. A function $S$ defined on $\mathbb{R}^n$ is said to be invariant under the
group $G$ if $S(g(y)) = S(y)$ for all $y \in \mathbb{R}^n$ and all $g \in G$. A subset $A$ of $\mathbb{R}^n$ is said to be invariant
under $G$ if $g(A) \subseteq A$ holds for every $g \in G$. Since with $g$ also $g^{-1}$ belongs to $G$, this is equivalent to
$g(A) = A$ for every $g \in G$, and thus to invariance of the indicator function of $A$ as defined before.^13
Clearly, invariance of $S : \mathbb{R}^n \to \bar{\mathbb{R}}$, the extended real line, under the group $G$ implies invariance of
the super-level sets $W = \{y : S(y) \geq C\}$. Furthermore, a function $S$ defined on $\mathbb{R}^n$ is said to be
almost invariant under the group $G$ if $S(g(y)) = S(y)$ holds for all $g \in G$ and all $y \in \mathbb{R}^n \setminus N(g)$
with Borel-sets $N(g)$ satisfying $\lambda_{\mathbb{R}^n}(N(g)) = 0$ and also $\lambda_{\mathbb{R}^n}(g'^{-1}(N(g))) = 0$ for all $g' \in G$.^14 A
subset $A$ of $\mathbb{R}^n$ is said to be almost invariant if $g(A) \subseteq A \cup N(g)$ holds for every $g \in G$ with the
Borel-sets $N(g)$ satisfying $\lambda_{\mathbb{R}^n}(N(g)) = 0$ and $\lambda_{\mathbb{R}^n}(g'^{-1}(N(g))) = 0$ for all $g' \in G$. It is easy to
see that this is equivalent to $g(A) \,\triangle\, A \subseteq N^*(g)$ for every $g \in G$, with Borel-sets $N^*(g)$ satisfying
$\lambda_{\mathbb{R}^n}(N^*(g)) = 0$ and $\lambda_{\mathbb{R}^n}(g'^{-1}(N^*(g))) = 0$ for all $g' \in G$; thus it is equivalent to almost invariance
of the indicator function of $A$. Clearly, almost invariance of $S : \mathbb{R}^n \to \bar{\mathbb{R}}$ under the group $G$ implies
almost invariance of the super-level sets $W = \{y : S(y) \geq C\}$.

13 If $G$ is only a collection of bijective transformations on $\mathbb{R}^n$ but is not a group, then invariance of $A$ does not
imply $g(A) = A$ in general, and in particular does not coincide with the notion of invariance of the indicator function
of $A$.

14 The additional requirement $\lambda_{\mathbb{R}^n}(g'^{-1}(N(g))) = 0$ for all $g' \in G$ of course implies $\lambda_{\mathbb{R}^n}(N(g)) = 0$ and may
appear artificial at first sight. However, it arises naturally in the context of testing problems that are invariant under
the group $G$ and for which the relevant family of probability measures is equivalent to $\lambda_{\mathbb{R}^n}$, cf. Lehmann and Romano
(2005), Section 6.5. Regardless of this, the additional requirement already follows from $\lambda_{\mathbb{R}^n}(N(g)) = 0$ in case the
group $G$ is a group of affine transformations on $\mathbb{R}^n$, which will be the groups we are interested in.

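We are interested in some particular groups of affine transformations. For an affine subspace $\mathfrak{N}$
of $\mathbb{R}^n$ let

$$G(\mathfrak{N}) = \left\{ g_{\alpha,\nu,\nu'} : \alpha \neq 0,\ \nu' \in \mathfrak{N} \right\}$$

for some fixed but arbitrary $\nu \in \mathfrak{N}$, where the affine map $g_{\alpha,\nu,\nu'}$ is given by $g_{\alpha,\nu,\nu'}(y) = \alpha(y - \nu) + \nu'$
with $\alpha \in \mathbb{R}$. Observe that $G(\mathfrak{N})$ does not depend on the choice of $\nu$ (in particular, if $\mathfrak{N}$ is a linear
subspace, one may choose $\nu = 0$). Hence, $G(\mathfrak{N})$ can also be written in a redundant way as

$$G(\mathfrak{N}) = \left\{ g_{\alpha,\nu,\nu'} : \alpha \neq 0,\ \nu \in \mathfrak{N},\ \nu' \in \mathfrak{N} \right\}.$$

It is easy to see that $G(\mathfrak{N})$ is a group w.r.t. composition which is non-abelian except if $\mathfrak{N}$ is a
singleton. For later use we also note that $\mathfrak{N}$ as well as $\mathbb{R}^n \setminus \mathfrak{N}$ are invariant under $G(\mathfrak{N})$, and that
$G(\mathfrak{N})$ acts transitively on $\mathfrak{N}$ (but not on $\mathbb{R}^n \setminus \mathfrak{N}$ in general). Furthermore, note that the elements of
$G(\mathfrak{N})$ can also be written as $g_{\alpha,\nu,\nu'}(y) = \alpha y + (1 - \alpha)\nu + (\nu' - \nu)$.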
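The group property of $G(\mathfrak{N})$ and the alternative representation of its elements follow from a short computation, sketched here for a fixed $\nu \in \mathfrak{N}$. The symbols $\alpha_1, \alpha_2, \nu_1', \nu_2'$ are introduced only for this illustration.

```latex
\begin{align*}
g_{\alpha,\nu,\nu'}(y)
  &= \alpha(y-\nu)+\nu' = \alpha y + (1-\alpha)\nu + (\nu'-\nu), \\
\bigl(g_{\alpha_1,\nu,\nu_1'} \circ g_{\alpha_2,\nu,\nu_2'}\bigr)(y)
  &= \alpha_1\bigl(\alpha_2(y-\nu)+\nu_2'-\nu\bigr)+\nu_1'
   = g_{\alpha_1\alpha_2,\;\nu,\;\nu_1'+\alpha_1(\nu_2'-\nu)}(y), \\
g_{\alpha,\nu,\nu'}^{-1}(y)
  &= \alpha^{-1}(y-\nu')+\nu
   = g_{\alpha^{-1},\;\nu,\;\nu+\alpha^{-1}(\nu-\nu')}(y).
\end{align*}
```

Since $\mathfrak{N}$ is an affine subspace and $\nu_2' - \nu$ as well as $\nu - \nu'$ lie in its direction space, the points $\nu_1' + \alpha_1(\nu_2' - \nu)$ and $\nu + \alpha^{-1}(\nu - \nu')$ again belong to $\mathfrak{N}$, so compositions and inverses stay in $G(\mathfrak{N})$.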

Remark 5.1. We make an observation on the structure of $G(\mathfrak{N})$. Let $G_1(\mathfrak{N})$ denote the collection
of transformations $g_{\alpha,\nu,\nu}(y)$ for every $\alpha \neq 0$ and every $\nu \in \mathfrak{N}$, and let $G_2(\mathfrak{N})$ denote the collection of
transformations $g_{1,\nu,\nu'}(y)$ for every pair $\nu, \nu' \in \mathfrak{N}$. Obviously, $G_1(\mathfrak{N})$ as well as $G_2(\mathfrak{N})$ are subsets
of $G(\mathfrak{N})$, and every element of $G(\mathfrak{N})$ is the composition of an element in $G_2(\mathfrak{N})$ with an element of
$G_1(\mathfrak{N})$. While $G_2(\mathfrak{N})$ is a subgroup, $G_1(\mathfrak{N})$ is not (as it is not closed under composition) except in
the trivial case where $\mathfrak{N}$ is a singleton. However, the group generated by $G_1(\mathfrak{N})$ is precisely $G(\mathfrak{N})$.
As a consequence, any function $S$ which is invariant under the elements of $G_1(\mathfrak{N})$ (meaning that
$S(g(y)) = S(y)$ for all $y \in \mathbb{R}^n$ and all $g \in G_1(\mathfrak{N})$) is already invariant under the entire group $G(\mathfrak{N})$,
and a similar statement holds for almost invariance.

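Proposition 5.2. A maximal invariant for $G(\mathfrak{N})$ is given by

$$h(y) = \left\langle \Pi_{(\mathfrak{N}-\nu_*)^{\perp}}(y - \nu_*) \big/ \left\| \Pi_{(\mathfrak{N}-\nu_*)^{\perp}}(y - \nu_*) \right\| \right\rangle,$$

where $\nu_*$ is an arbitrary element of $\mathfrak{N}$. The maximal invariant $h$ in fact does not depend on the
choice of $\nu_* \in \mathfrak{N}$. [Here we use the convention $x / \|x\| = 0$ if $x = 0$.]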
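The invariance of the map in Proposition 5.2 can be checked numerically. The sketch below makes two assumptions that are not spelled out in this section: the affine space $\mathfrak{N}$ is represented as $\nu_* + \operatorname{span}(A)$ for a matrix `A` with full column rank, and the bracket $\langle x \rangle$ is taken to mean the sign normalization that makes the first nonzero component of $x$ positive. All function names are hypothetical.

```python
import numpy as np

def sign_normalize(x, tol=1e-12):
    """Flip the sign of x so that its first nonzero component is positive."""
    nonzero = np.flatnonzero(np.abs(x) > tol)
    if nonzero.size == 0:
        return np.zeros_like(x)
    return x if x[nonzero[0]] > 0 else -x

def maximal_invariant(y, A, nu_star):
    """Normalized projection of y - nu_star onto the orthogonal complement of span(A)."""
    Q, _ = np.linalg.qr(A)                 # orthonormal basis of the direction space
    v = y - nu_star
    resid = v - Q @ (Q.T @ v)
    norm = np.linalg.norm(resid)
    if norm < 1e-12:                       # convention x / ||x|| = 0 for x = 0
        return np.zeros_like(resid)
    return sign_normalize(resid / norm)

def g(y, alpha, nu, nu_prime):
    """Element g_{alpha, nu, nu'} of the group: y -> alpha * (y - nu) + nu'."""
    return alpha * (y - nu) + nu_prime

rng = np.random.default_rng(1)
A = rng.normal(size=(6, 2))
nu_star = rng.normal(size=6)
nu = nu_star + A @ rng.normal(size=2)          # two further points of the affine space
nu_prime = nu_star + A @ rng.normal(size=2)
y = rng.normal(size=6)

h_before = maximal_invariant(y, A, nu_star)
h_after = maximal_invariant(g(y, -2.5, nu, nu_prime), A, nu_star)
print(np.allclose(h_before, h_after))
```

The negative scaling factor in the example is deliberate: without the sign normalization the two normalized projections would differ by a factor of $-1$.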

Remark 5.3. Specializing to the case $\mathfrak{N} = \mathfrak{M}_0$ it is obvious that $\Pi_{(\mathfrak{M}_0-\mu_0)^{\perp}}(y - \mu_0)$ can be
computed as $y - X\hat\beta_{rest}(y)$, where $\hat\beta_{rest}$ denotes the restricted ordinary least squares estimator. It
follows that any test that is invariant under $G(\mathfrak{M}_0)$ depends only on the normalized restricted least
squares residuals, in fact only on $\left\langle (y - X\hat\beta_{rest}(y)) / \| y - X\hat\beta_{rest}(y) \| \right\rangle$. [For the tests considered in
Subsection 5.4 one can obtain this result also directly from the definition of the tests.]



Consider now the problem of testing $H_0$ versus $H_1$ as defined in (4). First observe that the sets
$\mathfrak{M}_0$ and $\mathfrak{M}_1$ are invariant under the transformations in $G(\mathfrak{M}_0)$. This implies that the parameter
spaces $\mathfrak{M}_i \times (0, \infty) \times \mathfrak{C}$ corresponding to $H_i$ (for $i = 0, 1$) are each invariant under the associated
group $G(\mathfrak{M}_0)$, i.e., the group consisting of all transformations defined on $\mathfrak{M} \times (0, \infty) \times \mathfrak{C}$
given by

$$(\mu, \sigma^2, \Sigma) \mapsto \left(\alpha(\mu - \mu_0) + \mu_0',\ \alpha^2\sigma^2,\ \Sigma\right),$$

where $\alpha \neq 0$, $\mu_0 \in \mathfrak{M}_0$, and $\mu_0' \in \mathfrak{M}_0$. [Note that the associated group strictly speaking also depends
on $\mathfrak{C}$, but we suppress this in the notation.] Second, the probability measures associated with $H_0$
and $H_1$ clearly satisfy

$$P_{\mu, \sigma^2\Sigma}(A) = P_{\alpha(\mu - \mu_0) + \mu_0',\ \alpha^2\sigma^2\Sigma}\left(\alpha(A - \mu_0) + \mu_0'\right) \tag{11}$$

for every $(\mu, \sigma^2, \Sigma) \in \mathfrak{M} \times (0, \infty) \times \mathfrak{C}$ and every Borel set $A \subseteq \mathbb{R}^n$. This shows that the testing
problem considered in (4) is invariant under the group $G(\mathfrak{M}_0)$ in the sense of Lehmann and Romano
(2005), Chapters 6 and 8. While trivial, it will be useful to note that (11) continues to hold if $\Sigma \in \mathfrak{C}$
is replaced by an arbitrary nonnegative definite symmetric $n \times n$ matrix $\Phi$. The next proposition
discusses invariance properties of the rejection probabilities of an almost invariant test $\varphi$ that will be
needed in subsequent subsections. As will be seen later, it is useful to consider in that proposition
the rejection probabilities $E_{\mu, \sigma^2\Phi}(\varphi)$ also for $\Phi$ a positive (or sometimes only nonnegative) definite
symmetric $n \times n$ matrix not necessarily belonging to the assumed covariance model $\mathfrak{C}$.

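Proposition 5.4. Let $\varphi : \mathbb{R}^n \to [0, 1]$ be a Borel-measurable function that is almost invariant under
$G(\mathfrak{M}_0)$.

1. For every $(\mu, \sigma^2) \in \mathfrak{M} \times (0, \infty)$ and for every positive definite symmetric $n \times n$ matrix $\Phi$ the
rejection probabilities satisfy

$$E_{\mu, \sigma^2\Phi}(\varphi) = E_{\alpha(\mu - \mu_0) + \mu_0',\ \alpha^2\sigma^2\Phi}(\varphi) \tag{12}$$

for all $\alpha \neq 0$, $\mu_0 \in \mathfrak{M}_0$, $\mu_0' \in \mathfrak{M}_0$. [If $\varphi$ is invariant under $G(\mathfrak{M}_0)$, then (12) holds even if $\Phi$
is only nonnegative definite and symmetric.]

2. For every $(\mu, \sigma^2) \in \mathfrak{M} \times (0, \infty)$ and every positive definite symmetric $n \times n$ matrix $\Phi$ we have
the representation

$$E_{\mu, \sigma^2\Phi}(\varphi) = E_{\Pi_{(\mathfrak{M}_0-\mu_0)^{\perp}}(\mu - \mu_0)/\sigma + \mu_0,\ \Phi}(\varphi) = E_{\left\langle \Pi_{(\mathfrak{M}_0-\mu_0)^{\perp}}(\mu - \mu_0)/\sigma \right\rangle + \mu_0,\ \Phi}(\varphi) \tag{13}$$

where $\mu_0$ is an arbitrary element of $\mathfrak{M}_0$. [Note that $\Pi_{(\mathfrak{M}_0-\mu_0)^{\perp}}(\mu - \mu_0)/\sigma$ actually does not
depend on the choice of $\mu_0$; and $\Pi_{(\mathfrak{M}_0-\mu_0)^{\perp}}(\mu - \mu_0)$ can be computed as $\mu - X\hat\beta_{rest}(\mu)$.]

3. The rejection probability $E_{\mu, \sigma^2\Phi}(\varphi)$ depends on $(\mu, \sigma^2) \in \mathfrak{M} \times (0, \infty)$ and $\Phi$ only through
$\left(\left\langle \Pi_{(\mathfrak{M}_0-\mu_0)^{\perp}}(\mu - \mu_0)/\sigma \right\rangle, \Phi\right)$. Furthermore, $\left\langle \Pi_{(\mathfrak{M}_0-\mu_0)^{\perp}}(\mu - \mu_0)/\sigma \right\rangle$ is in a bijective
correspondence with $\left\langle (R\beta - r)/\sigma \right\rangle$ where $\beta$ denotes the coordinates of $\mu$ in the basis given by the
columns of $X$. Thus the rejection probability $E_{\mu, \sigma^2\Phi}(\varphi)$ depends on $(\mu, \sigma^2) \in \mathfrak{M} \times (0, \infty)$ and
$\Phi$ only through $\left(\left\langle (R\beta - r)/\sigma \right\rangle, \Phi\right)$.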



Remark 5.5. (i) For $\Phi = \Sigma \in \mathfrak{C}$ relation (12) expresses the fact that the rejection probability of
the almost invariant test $\varphi$ is invariant under the associated group $G(\mathfrak{M}_0)$.

(ii) Setting $\alpha = 1$ in (12) and holding $\sigma^2$ and $\Phi$ fixed, we see that the rejection probability is,
in particular, constant along that translation of $\mathfrak{M}_0$ which passes through $\mu$.

(iii) If $\mu \in \mathfrak{M}_0$, choosing $\mu_0 = \mu$, $\alpha = \sigma^{-1}$, and fixing $\mu_0' \in \mathfrak{M}_0$, shows that $E_{\mu, \sigma^2\Phi}(\varphi) =
E_{\mu_0', \Phi}(\varphi)$. Hence, for $\mu \in \mathfrak{M}_0$, the rejection probability is constant in $(\mu, \sigma^2)$ and only depends on
$\Phi$.

(iv) Occasionally we consider tests $\varphi$ that are only required to be almost invariant under the
subgroup of transformations $y \mapsto \alpha y + (1 - \alpha)\mu_0$ for a fixed $\mu_0 \in \mathfrak{M}_0$, i.e., under the group $G(\{\mu_0\})$.
The results in the above propositions can be easily adapted to this case and we refrain from spelling
out the details. We only note that the analogue to (12) in this case is given by

$$E_{\mu, \sigma^2\Phi}(\varphi) = E_{\alpha(\mu - \mu_0) + \mu_0,\ \alpha^2\sigma^2\Phi}(\varphi) \tag{14}$$

for all $\alpha \neq 0$.

Part 2 of the above proposition has shown that the rejection probability depends on the
parameters only through $\left(\left\langle \Pi_{(\mathfrak{M}_0-\mu_0)^{\perp}}(\mu - \mu_0)/\sigma \right\rangle, \Sigma\right)$. This quantity is recognized as a maximal
invariant in the next result.

Proposition 5.6. Let $\mu_0 \in \mathfrak{M}_0$ be arbitrary. Then $\left(\left\langle \Pi_{(\mathfrak{M}_0-\mu_0)^{\perp}}(\mu - \mu_0)/\sigma \right\rangle, \Sigma\right)$ is a maximal
invariant for the associated group $G(\mathfrak{M}_0)$.

5.2 Negative results 

We next establish a negative result providing conditions under which (i) the size of a test is 1,
and/or (ii) the power function of a test gets arbitrarily close to zero. The theorem is based on a
"concentration effect" that we explain now: Suppose one can find a sequence $\Sigma_m \in \mathfrak{C}$ converging
to a singular matrix $\bar\Sigma$ and let $Z$ denote the span of the columns of $\bar\Sigma$. Let $\mu_0 \in \mathfrak{M}_0$. Since
the probability measures $P_{\mu_0, \sigma^2\Sigma_m}$ converge weakly to $P_{\mu_0, \sigma^2\bar\Sigma}$, which has support $\mu_0 + Z$, they
concentrate their mass more and more around $\mu_0 + Z$. Suppose first that one can show that
$\mu_0 + Z$ is essentially contained in the interior of the rejection region $W$ in the sense that the set
of points in $\mu_0 + Z$ which are not interior points of $W$ has $\lambda_{\mu_0+Z}$-measure zero. It then follows
that $P_{\mu_0, \sigma^2\Sigma_m}(W)$ converges to $P_{\mu_0, \sigma^2\bar\Sigma}(W) \geq P_{\mu_0, \sigma^2\bar\Sigma}(\mu_0 + Z) = 1$, establishing that the size
of the test is 1. Now, in some cases of interest it turns out that $\mu_0 + Z$ fails to satisfy the just
mentioned "interiority" condition with respect to the rejection region $W$, but it also turns out that it
does satisfy the "interiority" condition with respect to an "equivalent" rejection region $W'$, which is
obtained by adjoining a $\lambda_{\mathbb{R}^n}$-null set to $W$ (for example, for $W' = W \cup (\mu_0 + Z)$). Since the rejection
probabilities corresponding to $W$ and $W'$ are identical (as any $\Sigma \in \mathfrak{C}$ is positive definite) and thus
the two tests have the same size, the above reasoning can then be applied to $W'$, again showing
that the size of the test based on $W$ is 1 for these cases. Part 1 of Theorem 5.7 below formalizes this
reasoning. The same "concentration effect" reasoning applied to $\mathbb{R}^n \setminus W$ instead of $W$ then gives
(17). [The remaining claims in Part 2 as well as Part 3 are then consequences of (17) combined
with continuity or invariance properties of the power function.] It should, however, be stressed that
weak convergence of $P_{\mu_0, \sigma^2\Sigma_m}$ to $P_{\mu_0, \sigma^2\bar\Sigma}$ together with the inclusion $\mu_0 + Z \subseteq W$ (except possibly
for a $\lambda_{\mu_0+Z}$-null set) alone is not sufficient to allow one to draw the conclusion - as tempting as it
may be - that $P_{\mu_0, \sigma^2\Sigma_m}(W) \to 1$ although "in the limit" $P_{\mu_0, \sigma^2\bar\Sigma}(W) = 1$ holds. Counterexamples
where $P_{\mu_0, \sigma^2\Sigma_m}$ converges weakly to $P_{\mu_0, \sigma^2\bar\Sigma}$ and $\mu_0 + Z \subseteq W$ (and thus $P_{\mu_0, \sigma^2\bar\Sigma}(W) = 1$) holds,
but where $P_{\mu_0, \sigma^2\Sigma_m}(W)$ converges to a positive number less than 1 are easily found with the help of
Theorem 5.10. We furthermore note that in a different testing context Martellosio (2010) provides
a result which also makes use of a "concentration effect", but his result is not correct as given. For
a discussion of these issues and corrected results see Preinerstorfer and Pötscher (2013).

The "concentration effect" reasoning underlying Theorem 5.7 of course hinges crucially on the
"interiority" condition (either w.r.t. $W$ or w.r.t. $\mathbb{R}^n \setminus W$), raising the question why we should expect
this to be satisfied in the applications we have in mind, rather than expect that $\mu_0 + Z$ intersects
with both $W$ and $\mathbb{R}^n \setminus W$ in such a way that the "interiority" condition is neither satisfied w.r.t. $W$
nor w.r.t. $\mathbb{R}^n \setminus W$. Consider the case where $Z$ is one-dimensional, a case of paramount importance in
the applications, and suppose also that $W$ is invariant under the group $G(\mathfrak{M}_0)$. Then we have the
dichotomy that $(\mu_0 + Z) \setminus \{\mu_0\}$ either lies entirely in $W$ or in $\mathbb{R}^n \setminus W$, showing that - except possibly
for the point $\mu_0$ - the set $\mu_0 + Z$ never intersects both $W$ and $\mathbb{R}^n \setminus W$. Moreover, if an element
of $(\mu_0 + Z) \setminus \{\mu_0\}$ belongs to the interior of $W$ (of $\mathbb{R}^n \setminus W$, respectively), then $(\mu_0 + Z) \setminus \{\mu_0\}$ in
its entirety is a subset of the interior of $W$ (of $\mathbb{R}^n \setminus W$, respectively). Hence, under the mentioned
invariance and for one-dimensional $Z$, one can expect the "interiority" conditions in the subsequent
theorem to be satisfied not infrequently.

Theorem 5.7. Let $W$ be a Borel set in $\mathbb{R}^n$, the rejection region of a test. Furthermore, assume
that $Z$ is a concentration space of the covariance model $\mathfrak{C}$. Then the following holds:

1. If $\mu_0 \in \mathfrak{M}_0$ satisfies

$$\lambda_{\mu_0+Z}\left(\operatorname{bd}(W \cup (\mu_0 + Z))\right) = 0, \tag{15}$$

then for every $0 < \sigma^2 < \infty$

$$\sup_{\Sigma \in \mathfrak{C}} P_{\mu_0, \sigma^2\Sigma}(W) = 1$$

holds; in particular, the size of the test equals 1. [In case $W$ is of the form $\{y \in \mathbb{R}^n : T(y) \geq C\}$
for some Borel-measurable function $T : \mathbb{R}^n \to \mathbb{R}$ and $0 < C < \infty$, a sufficient condition for
(15) is that for $\lambda_Z$-almost every $z \in Z$ the test statistic $T$ satisfies $T(\mu_0 + z) > C$ and is
lower semicontinuous at $\mu_0 + z$.]

2. If $\mu_0 \in \mathfrak{M}_0$ satisfies

$$\lambda_{\mu_0+Z}\left(\operatorname{bd}((\mathbb{R}^n \setminus W) \cup (\mu_0 + Z))\right) = 0, \tag{16}$$

then for every $0 < \sigma^2 < \infty$

$$\inf_{\Sigma \in \mathfrak{C}} P_{\mu_0, \sigma^2\Sigma}(W) = 0, \tag{17}$$

and hence

$$\inf_{\mu_1 \in \mathfrak{M}_1} \inf_{\Sigma \in \mathfrak{C}} P_{\mu_1, \sigma^2\Sigma}(W) = 0$$

holds for every $0 < \sigma^2 < \infty$. In particular, the test is biased (except in the trivial case where
its size is zero). [In case $W$ is of the form $\{y \in \mathbb{R}^n : T(y) \geq C\}$ for some Borel-measurable
function $T : \mathbb{R}^n \to \mathbb{R}$ and $0 < C < \infty$, a sufficient condition for (16) is that for $\lambda_Z$-almost
every $z \in Z$ the test statistic $T$ satisfies $T(\mu_0 + z) < C$ and is upper semicontinuous at $\mu_0 + z$.]

3. Suppose that condition (17) is satisfied for some $\mu_0 \in \mathfrak{M}_0$ and some $0 < \sigma^2 < \infty$. Furthermore,
assume that $W$ is almost invariant under the group $G(\{\mu_0\})$. Then for every $\mu_1 \in \mathfrak{M}_1$
we have

$$\inf_{0 < \sigma^2 < \infty} \inf_{\Sigma \in \mathfrak{C}} P_{\mu_1, \sigma^2\Sigma}(W) = 0.$$

[In case $W$ is of the form $\{y \in \mathbb{R}^n : T(y) \geq C\}$ for some Borel-measurable function $T : \mathbb{R}^n \to
\mathbb{R}$ and $0 < C < \infty$, almost invariance of $W$ under the group $G(\{\mu_0\})$ follows from almost
invariance of $T$ under $G(\{\mu_0\})$.]

Remark 5.8. (i) The conclusions of the above theorem immediately also apply to every test
statistic $T'$ that is $\lambda_{\mathbb{R}^n}$-almost surely equal to a test statistic $T$ satisfying the assumptions of the
theorem.

(ii) Let $\varphi : \mathbb{R}^n \to [0, 1]$ be Borel-measurable, i.e., a test. If the set $\{y : \varphi(y) = 1\}$ satisfies the
assumptions on $W$ in Part 1 of the above theorem, then for every $0 < \sigma^2 < \infty$

$$\sup_{\Sigma \in \mathfrak{C}} E_{\mu_0, \sigma^2\Sigma}(\varphi) = 1$$

holds. If the set $\{y : \varphi(y) = 0\}$ satisfies the assumptions on $\mathbb{R}^n \setminus W$ in Part 2 of the above theorem
then for every $0 < \sigma^2 < \infty$

$$\inf_{\Sigma \in \mathfrak{C}} E_{\mu_0, \sigma^2\Sigma}(\varphi) = 0$$

holds. A similar remark applies to Part 3 of the theorem, provided $\varphi$ is almost invariant under
$G(\{\mu_0\})$.

Remark 5.9. If the covariance model $\mathfrak{C}$ contains AR(1) correlation matrices $\Lambda(\rho_m)$ for some
sequence $\rho_m \in (-1, 1)$ with $\rho_m \to 1$ ($\rho_m \to -1$, respectively), then $\operatorname{span}(e_+)$ ($\operatorname{span}(e_-)$, respectively)
is a concentration space of $\mathfrak{C}$ (cf. Lemma G.1 in Appendix G). Hence Theorem 5.7 applies with
$Z = \operatorname{span}(e_+)$ ($Z = \operatorname{span}(e_-)$, respectively). In particular, if $\mathfrak{C}$ contains $\mathfrak{C}_{AR(1)}$, then Theorem 5.7
applies with $Z = \operatorname{span}(e_+)$ as well as with $Z = \operatorname{span}(e_-)$.
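The concentration spaces in Remark 5.9 can be made visible numerically. The sketch below assumes that $\Lambda(\rho)$ is the usual AR(1) correlation matrix with entries $\rho^{|i-j|}$, that $e_+$ is the vector of ones, and that $e_-$ has alternating signs; the function name `ar1_corr` is a choice made here.

```python
import numpy as np

def ar1_corr(rho, n):
    """AR(1) correlation matrix with (i, j) entry rho ** |i - j|."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

n = 8
e_plus = np.ones(n)
e_minus = (-1.0) ** np.arange(1, n + 1)

for rho in (0.9, 0.99, 0.999, 0.999999):
    eigenvalues = np.linalg.eigvalsh(ar1_corr(rho, n))   # ascending order
    # All eigenvalues except the largest shrink as rho approaches 1,
    # i.e. the matrices approach the rank-one matrix e_+ e_+'.
    print(rho, eigenvalues[-2], eigenvalues[-1])

near_plus = ar1_corr(1 - 1e-8, n)
near_minus = ar1_corr(-1 + 1e-8, n)
print(np.abs(near_plus - np.outer(e_plus, e_plus)).max())
print(np.abs(near_minus - np.outer(e_minus, e_minus)).max())
```

The two limits are singular matrices of rank one whose column spans are $\operatorname{span}(e_+)$ and $\operatorname{span}(e_-)$, respectively.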

5.3 Positive results 

The next theorem isolates conditions under which a test does not suffer from the extreme size and
power problems encountered in the preceding subsection. In particular, we provide conditions which
guarantee that the size is bounded away from one and that the power function is bounded away
from zero. The theorem assumes that the test $\varphi$ - apart from being (almost) invariant under the
group $G(\mathfrak{M}_0)$ - is also invariant under addition of elements of $J(\mathfrak{C})$ defined below. This additional
invariance assumption will be automatically satisfied in the important special case where $\varphi$ is
invariant under the group $G(\mathfrak{M}_0)$ and where $J(\mathfrak{C}) \subseteq \mathfrak{M}_0 - \mu_0$ for some $\mu_0 \in \mathfrak{M}_0$ (and hence for all
$\mu_0 \in \mathfrak{M}_0$) as then the maps $x \mapsto x + z$ for $z \in J(\mathfrak{C})$ are elements of $G(\mathfrak{M}_0)$; see also Proposition 5.23
and the attending discussion in Subsection 5.4. A second assumption of the subsequent theorem is
that the covariance model $\mathfrak{C}$ is bounded, which is typically a harmless assumption in applications as
it is, e.g., always satisfied if the elements of $\mathfrak{C}$ are normalized such that the largest diagonal element
is 1, or such that the trace is 1. The theorem also maintains a further assumption on the covariance
model $\mathfrak{C}$ related to the way sequences of elements in $\mathfrak{C}$ approach singular matrices. This condition
has to be verified for the covariance model $\mathfrak{C}$ in any particular application. A verification for $\mathfrak{C}_{AR(1)}$
is given in Appendix G; cf. also Remarks 5.14 and 5.20.
For a covariance model $\mathfrak{C}$ define now

$$J(\mathfrak{C}) = \bigcup \left\{ \operatorname{span}(\bar\Sigma) : \det \bar\Sigma = 0,\ \bar\Sigma = \lim_{m \to \infty} \Sigma_m \text{ for a sequence } \Sigma_m \in \mathfrak{C} \right\},$$

i.e., $J(\mathfrak{C})$ is the union of all concentration spaces of the covariance model $\mathfrak{C}$. [Note that the
subsequent results remain valid in the case where $J(\mathfrak{C})$ is empty.]

Theorem 5.10. Let $\varphi : \mathbb{R}^n \to [0, 1]$ be a Borel-measurable function that is almost invariant under
$G(\mathfrak{M}_0)$. Suppose that $\varphi$ is neither $\lambda_{\mathbb{R}^n}$-almost everywhere equal to 1 nor $\lambda_{\mathbb{R}^n}$-almost everywhere
equal to 0. Suppose further that

$$\varphi(x + z) = \varphi(x) \text{ for every } x \in \mathbb{R}^n \text{ and every } z \in J(\mathfrak{C}). \tag{18}$$

Assume that $\mathfrak{C}$ is bounded (as a subset of $\mathbb{R}^{n \times n}$). Assume also that for every sequence $\Sigma_m \in
\mathfrak{C}$ converging to a singular $\bar\Sigma$ there exists a subsequence $(m_i)_{i \in \mathbb{N}}$ and a sequence of positive real
numbers $s_{m_i}$ such that the sequence of matrices $D_{m_i} = \Pi_{\operatorname{span}(\bar\Sigma)^{\perp}} \Sigma_{m_i} \Pi_{\operatorname{span}(\bar\Sigma)^{\perp}} / s_{m_i}$ converges to a
matrix $D$ which is regular on the orthogonal complement of $\operatorname{span}(\bar\Sigma)$ (meaning that the linear map
corresponding to $D$ is injective when restricted to the orthogonal complement of $\operatorname{span}(\bar\Sigma)$).^15 Then
the following holds:

1. The size of the test $\varphi$ is strictly less than 1, i.e.,

$$\sup_{\mu_0 \in \mathfrak{M}_0} \sup_{0 < \sigma^2 < \infty} \sup_{\Sigma \in \mathfrak{C}} E_{\mu_0, \sigma^2\Sigma}(\varphi) < 1.$$

Furthermore,

$$\inf_{\mu_0 \in \mathfrak{M}_0} \inf_{0 < \sigma^2 < \infty} \inf_{\Sigma \in \mathfrak{C}} E_{\mu_0, \sigma^2\Sigma}(\varphi) > 0.$$

2. Suppose additionally that for every sequence $\nu_m \in \Pi_{(\mathfrak{M}_0-\mu_0)^{\perp}}(\mathfrak{M}_1 - \mu_0)$ with $\|\nu_m\| \to \infty$ and
for every sequence $\Phi_m$ of positive definite symmetric $n \times n$ matrices with $\Phi_m \to \Phi$, $\Phi$ positive
definite, we have

$$\liminf_{m \to \infty} E_{\nu_m + \mu_0, \Phi_m}(\varphi) > 0, \tag{19}$$

where $\mu_0$ is an element of $\mathfrak{M}_0$. [This condition clearly does not depend on the particular choice
of $\mu_0 \in \mathfrak{M}_0$.] Then the infimal power is bounded away from zero, i.e.,

$$\inf_{\mu_1 \in \mathfrak{M}_1} \inf_{0 < \sigma^2 < \infty} \inf_{\Sigma \in \mathfrak{C}} E_{\mu_1, \sigma^2\Sigma}(\varphi) > 0.$$

3. Suppose that the limit inferior in (19) is 1 for every sequence $\nu_m$ and $\Phi_m$ as specified above.
Then for every $0 < c < \infty$

$$\inf_{\substack{\mu_1 \in \mathfrak{M}_1,\ 0 < \sigma^2 < \infty \\ d(\mu_1, \mathfrak{M}_0)/\sigma \geq c}} E_{\mu_1, \sigma^2\Sigma_m}(\varphi) \to 1 \tag{20}$$

holds for $m \to \infty$ and for any sequence $\Sigma_m \in \mathfrak{C}$ satisfying $\Sigma_m \to \bar\Sigma$ with $\bar\Sigma$ a singular matrix.
Furthermore, for every sequence $0 < c_m < \infty$

$$\inf_{\substack{\mu_1 \in \mathfrak{M}_1 \\ d(\mu_1, \mathfrak{M}_0) \geq c_m}} E_{\mu_1, \sigma_m^2\Sigma_m}(\varphi) \to 1 \tag{21}$$

holds for $m \to \infty$ whenever $0 < \sigma_m^2 < \infty$, $c_m/\sigma_m \to \infty$, and the sequence $\Sigma_m \in \mathfrak{C}$ satisfies
$\Sigma_m \to \bar\Sigma$ with $\bar\Sigma$ a positive definite matrix. [The very last statement even holds without
recourse to condition (18) and the condition on $\mathfrak{C}$ following (18).]

15 Of course, $D$ maps every element of $\operatorname{span}(\bar\Sigma)$ into zero by construction.

The first two parts of the preceding theorem provide conditions under which the size is strictly
less than 1 and the infimal power is strictly positive, while the third part provides conditions under
which the power approaches 1 in certain parts of the parameter space, the parts being characterized
by the property that either $\|R\beta^{(1)} - r\|/\sigma$ is bounded away from zero and $\Sigma_m$ approaches a
singular matrix, or that $\|R\beta^{(1)} - r\|/\sigma_m \to \infty$ and $\Sigma_m$ approaches a positive definite matrix.
Here $\beta^{(1)}$ is the parameter vector corresponding to $\mu_1$. Note that $d(\mu_1, \mathfrak{M}_0)$ is bounded from above
as well as from below by multiples of $\|R\beta^{(1)} - r\|$, where the constants involved are positive and
depend only on $X$, $R$, and $r$.

Remark 5.11. (i) Because $J(\mathfrak{C})$ as a union of linear spaces is homogeneous, condition (18) is
equivalent to the condition that $\varphi(x + z) = \varphi(x)$ holds for every $x \in \mathbb{R}^n$ and every $z \in \operatorname{span}(J(\mathfrak{C}))$.
(ii) If condition (19) in Theorem 5.10 is replaced by the weaker condition

$$\liminf_{m \to \infty} E_{d_m \Pi_{(\mathfrak{M}_0-\mu_0)^{\perp}}(\mu_1 - \mu_0) + \mu_0,\ \Phi_m}(\varphi) > 0, \tag{22}$$

for every $\mu_1 \in \mathfrak{M}_1$, for every $d_m \to \infty$ and every sequence $\Phi_m$ of positive definite symmetric $n \times n$
matrices with $\Phi_m \to \Phi$, $\Phi$ a positive definite matrix, then we can only establish for every $\mu_1 \in \mathfrak{M}_1$
that

$$\inf_{0 < \sigma^2 < \infty} \inf_{\Sigma \in \mathfrak{C}} E_{\mu_1, \sigma^2\Sigma}(\varphi) > 0.$$

If the limes inferior in (22) is 1 for every $\mu_1$, $d_m$, and $\Phi_m$ as specified above, then for every $\mu_1 \in \mathfrak{M}_1$
and every $0 < \sigma_*^2 < \infty$ we have

$$\inf_{0 < \sigma^2 \leq \sigma_*^2} E_{\mu_1, \sigma^2\Sigma_m}(\varphi) \to 1$$

for any sequence $\Sigma_m \in \mathfrak{C}$ satisfying $\Sigma_m \to \bar\Sigma$ with $\bar\Sigma$ a singular matrix; and also $E_{\mu_1, \sigma_m^2\Sigma_m}(\varphi) \to 1$
holds whenever $\sigma_m^2 \to 0$ and the sequence $\Sigma_m \in \mathfrak{C}$ satisfies $\Sigma_m \to \bar\Sigma$ with $\bar\Sigma$ a positive definite
matrix. [The very last statement even holds without recourse to condition (18) and the condition
on $\mathfrak{C}$ following (18).]



The subsequent theorem elaborates on Part 1 of Theorem 5.10 and shows that under the additional
assumptions one can not only guarantee that the size of the test is smaller than 1, but one
can, for any prescribed significance level $\delta$ ($0 < \delta < 1$), construct the test in such a way that it
has size not exceeding $\delta$. The result applies in particular to the important case where the tests are
of the form $\varphi_C = \mathbf{1}(T \geq C)$ for some test statistic $T$. Note that for any $C_k \uparrow \infty$ the sequence of
tests $\varphi_{C_k}$ clearly satisfies condition (23) in the subsequent theorem provided $\{y : T(y) = \infty\}$ is a
$\lambda_{\mathbb{R}^n}$-null set. Thus in this case the theorem shows that for any given significance level $\delta$, $0 < \delta < 1$,
we can find a critical value $C(\delta)$ such that the test $\varphi_{C(\delta)}$ has a size not exceeding $\delta$.

Theorem 5.12. Let $\varphi_k : \mathbb{R}^n \to [0, 1]$ for $k \geq 1$ be a sequence of Borel-measurable functions each of
which satisfies the assumptions for Part 1 of Theorem 5.10 and let $\mathfrak{C}$ also satisfy the assumptions
of that theorem. Furthermore assume that the sequence $\varphi_k$ satisfies

$$E_{\mu_0, \Phi}(\varphi_k) \downarrow 0 \tag{23}$$

as $k \uparrow \infty$ for some $\mu_0 \in \mathfrak{M}_0$ and all positive definite symmetric $n \times n$ matrices $\Phi$. Then for every
$\delta$, $0 < \delta < 1$, there exists a $k_0 = k_0(\delta)$ such that

$$\sup_{\mu_0 \in \mathfrak{M}_0} \sup_{0 < \sigma^2 < \infty} \sup_{\Sigma \in \mathfrak{C}} E_{\mu_0, \sigma^2\Sigma}(\varphi_{k_0}) \leq \delta.$$

Remark 5.13. (i) The assumption in Theorem 5.10 that $\varphi_k$ is not $\lambda_{\mathbb{R}^n}$-almost everywhere equal
to 0 is of course irrelevant for the result in Theorem 5.12.

(ii) Of course, the second part of Part 1 of Theorem 5.10 immediately applies to $\varphi_{k_0}$; and Parts
2 and 3 of that theorem also apply to $\varphi_{k_0}$ provided $\varphi_{k_0}$ satisfies the respective additional conditions.

Remark 5.14. (i) In case the covariance model $\mathfrak{C}$ equals $\mathfrak{C}_{AR(1)}$, the boundedness condition in
Theorems 5.10 and 5.12 is clearly satisfied and $J(\mathfrak{C})$ reduces to $\operatorname{span}(e_+) \cup \operatorname{span}(e_-)$. Furthermore,
the condition on the covariance model $\mathfrak{C}$ in those theorems expressed in terms of the matrices $D_m$ is
then also satisfied as shown in Lemma G.1 in Appendix G. Also note that in this case the sequences
$\Sigma_m$ in Part 3 of Theorem 5.10 converging to a singular matrix are of the form $\Lambda(\rho_m)$ with $\rho_m \to 1$
or $\rho_m \to -1$.

(ii) More generally suppose that $\mathfrak{C}$ is norm-bounded, has $e_+e_+'$ and $e_-e_-'$ as the only singular
accumulation points, and has the property that for every sequence $\Sigma_m \in \mathfrak{C}$ converging to one of these
limit points there exists a sequence $(\rho_m)_{m \in \mathbb{N}}$ in $(-1, 1)$ such that $\Lambda(\rho_m)^{-1/2} \Sigma_m \Lambda(\rho_m)^{-1/2} \to I_n$ for
$m \to \infty$ (that is, near the "singular boundary" the covariance model $\mathfrak{C}$ behaves similarly to $\mathfrak{C}_{AR(1)}$).
Then $J(\mathfrak{C})$ is as in (i) and again the conditions on the covariance model $\mathfrak{C}$ in Theorems 5.10 and
5.12 are satisfied.
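The condition on the matrices $D_m$ can be explored numerically for the AR(1) model with $\rho_m \to 1$. The sketch below is not the argument of Lemma G.1, which is not reproduced in this section: the scaling $s_m = 1 - \rho_m$ is an assumption made here, motivated by the expansion $\rho^{|i-j|} = 1 - (1-\rho)|i-j| + O((1-\rho)^2)$, which suggests the limit $D = -\Pi M \Pi$ with $M_{ij} = |i-j|$ and $\Pi$ the projection onto the orthogonal complement of $\operatorname{span}(e_+)$.

```python
import numpy as np

def ar1_corr(rho, n):
    """AR(1) correlation matrix with (i, j) entry rho ** |i - j|."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

n = 10
e_plus = np.ones(n)
# Projection onto the orthogonal complement of span(e_+).
P = np.eye(n) - np.outer(e_plus, e_plus) / n

def scaled_projection(rho):
    """D_m for Sigma_m = Lambda(rho), using the assumed scaling s_m = 1 - rho."""
    return P @ ar1_corr(rho, n) @ P / (1.0 - rho)

idx = np.arange(n)
M = np.abs(idx[:, None] - idx[None, :]).astype(float)
candidate_limit = -P @ M @ P

D = scaled_projection(1 - 1e-6)
print(np.abs(D - candidate_limit).max())
print(np.linalg.matrix_rank(candidate_limit, tol=1e-8))   # expected: n - 1
```

A rank of $n - 1$ together with $D e_+ = 0$ means that the candidate limit is injective on the orthogonal complement of $\operatorname{span}(e_+)$, which is the regularity property required in Theorem 5.10.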

5.4 Size and power properties of a common class of tests: Nonsphericity- 
corrected F-type tests 

In this subsection we specialize the preceding results to a broad class of tests of linear restrictions
in linear regression models with nonspherical errors and derive a further result specific to this class.
The class considered in this subsection contains the vast majority of tests proposed in the literature
for this testing problem. We start with a pair of estimators $\check\beta$ and $\check\Omega$, where $\check\Omega$ typically has the
interpretation of an estimator of the variance covariance matrix of $R\check\beta - r$ under the null hypothesis.
As in previous sections, the estimators are viewed as functions of $y \in \mathbb{R}^n$, but it proves useful
to allow for cases where the estimators are not defined for some exceptional values of $y$. We impose
the following assumption on the estimators.

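Assumption 5. (i) The estimators $\check\beta : \mathbb{R}^n\backslash N \to \mathbb{R}^k$ and $\check\Omega : \mathbb{R}^n\backslash N \to \mathbb{R}^{q\times q}$ are well-defined and continuous on the complement of a closed $\lambda_{\mathbb{R}^n}$-null set $N$ in the sample space $\mathbb{R}^n$, with $\check\Omega$ also being symmetric on $\mathbb{R}^n\backslash N$.

(ii) The set $\mathbb{R}^n\backslash N$ is invariant under the group $G(\mathfrak{M})$, i.e., $y \in \mathbb{R}^n\backslash N$ implies $\alpha y + X\gamma \in \mathbb{R}^n\backslash N$ for every $\alpha \neq 0$ and every $\gamma \in \mathbb{R}^k$.

(iii) The estimators satisfy the equivariance properties $\check\beta(\alpha y + X\gamma) = \alpha\check\beta(y) + \gamma$ and $\check\Omega(\alpha y + X\gamma) = \alpha^2\check\Omega(y)$ for every $y \in \mathbb{R}^n\backslash N$, for every $\alpha \neq 0$, and for every $\gamma \in \mathbb{R}^k$.

(iv) $\check\Omega$ is $\lambda_{\mathbb{R}^n}$-almost surely nonsingular on $\mathbb{R}^n\backslash N$.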

We make a few obvious observations: First, the invariance of $\mathbb{R}^n\backslash N$ under the group $G(\mathfrak{M})$ expressed in Assumption 5 is equivalent to the same invariance property of $N$ itself. Second, since $N$ is closed by Assumption 5, it follows that either $N$ is empty or otherwise must at least contain $\mathfrak{M}$ (to see this note that $y \in N$ implies $\alpha y \in N$ for $\alpha$ arbitrarily close to zero, which in turn implies $0 \in N$ by closedness of $N$; invariance then gives $X\gamma \in N$ for every $\gamma \in \mathbb{R}^k$). Third, given Assumption 5 holds, the sets $\{y \in \mathbb{R}^n\backslash N : \det\check\Omega(y) = 0\}$ and $\{y \in \mathbb{R}^n\backslash N : \det\check\Omega(y) \neq 0\}$ are invariant under the transformations in $G(\mathfrak{M})$, and the set

$$N^* = N \cup \{y \in \mathbb{R}^n\backslash N : \det\check\Omega(y) = 0\} \qquad (24)$$

is a closed $\lambda_{\mathbb{R}^n}$-null set that is also invariant under the transformations in $G(\mathfrak{M})$; cf. Lemma F.1 in Appendix F. Hence, the set $\{y \in \mathbb{R}^n\backslash N : \det\check\Omega(y) = 0\}$ could in principle have been absorbed into $N$ in the above assumption; however, we shall not do so since keeping the exceptional set $N$ as small as possible will lead to stronger results. Furthermore, $\mathfrak{M} \subseteq N^*$ always holds. To see this note that $\mathfrak{M} \subseteq N \subseteq N^*$ holds if $N$ is not empty as noted above; in case $N$ is empty, $\check\Omega(y)$ is well-defined for every $y$ and $\check\Omega(0) = \check\Omega(\alpha 0) = \alpha^2\check\Omega(0)$ must hold, implying $\check\Omega(0) = 0$ and thus also $\check\Omega(X\gamma) = \check\Omega(\alpha 0 + X\gamma) = \alpha^2\check\Omega(0) = 0$. In particular, this shows that either $\check\Omega$ is not defined on $\mathfrak{M}$ or is zero on $\mathfrak{M}$.

Given estimators $\check\beta$ and $\check\Omega$ satisfying Assumption 5 we define the test statistic

$$T(y) = \begin{cases} (R\check\beta(y) - r)'\,\check\Omega^{-1}(y)\,(R\check\beta(y) - r), & y \in \mathbb{R}^n\backslash N^*, \\ 0, & y \in N^*. \end{cases} \qquad (25)$$

We note that assigning the test statistic the value zero at points $y \in \mathbb{R}^n$ for which either $y \in N$ or $\det\check\Omega(y) = 0$ holds is arbitrary, but has no effect on the rejection probabilities of the test, since $N^*$ is a $\lambda_{\mathbb{R}^n}$-null set as noted above and since all relevant probability measures $P_{\mu,\sigma^2\Sigma}$ are absolutely continuous w.r.t. Lebesgue measure on $\mathbb{R}^n$.

In line with the interpretation of $\check\Omega$ as an estimator for a variance covariance matrix, the leading case is when $\check\Omega$ is positive definite almost everywhere. However, sometimes we encounter situations where this is not guaranteed for a given fixed sample size (cf. Subsection 3.2), although typically the probability of being positive definite will go to one for each fixed value of the parameters as sample size increases. In order to be able to accommodate also such cases, Assumption 5 does not contain a requirement that $\check\Omega$ is positive definite almost everywhere. Nevertheless, in light of what has just been said, we shall consider the rejection region to be of the form $\{y \in \mathbb{R}^n : T(y) \ge C\}$ for $C$ a real number satisfying $0 < C < \infty$.

For some of the results that follow we shall need further conditions on $\check\Omega$ which, however, are much weaker than the almost everywhere positive definiteness requirement just mentioned.

Assumption 6. There exists $v \in \mathbb{R}^q$, $v \neq 0$, and a $y \in \mathbb{R}^n\backslash N^*$ such that $v'\check\Omega^{-1}(y)v > 0$ holds.

Since under Assumption 5 the matrix $\check\Omega^{-1}(y)$ is continuous on $\mathbb{R}^n\backslash N^*$, it follows that Assumption 6 in fact implies that $v'\check\Omega^{-1}(y)v > 0$ holds on an open set of $y$'s. The condition expressed in the next assumption is also certainly satisfied if $\check\Omega$ is positive definite almost everywhere. At first glance it may seem that this condition rules out the case where $\check\Omega(y)$ is allowed to be indefinite on a set of positive Lebesgue measure, but this is not so as $v$ is not allowed to depend on $y$ in this condition.

Assumption 7. For every $v \in \mathbb{R}^q$ with $v \neq 0$ we have $\lambda_{\mathbb{R}^n}(\{y \in \mathbb{R}^n\backslash N^* : v'\check\Omega^{-1}(y)v = 0\}) = 0$.

The following lemma collects some properties of the test statistic that will be useful in the sequel.

Lemma 5.15. Suppose Assumption 5 is satisfied and let $T$ be the test statistic defined in (25). Then the following holds:

1. The set $\mathbb{R}^n\backslash N^*$ is invariant under the elements of $G(\mathfrak{M})$.

2. The test statistic $T$ is continuous on $\mathbb{R}^n\backslash N^*$; in particular, $T$ is $\lambda_{\mathbb{R}^n}$-almost surely continuous on $\mathbb{R}^n$.

3. The test statistic $T$ is invariant under the group $G(\mathfrak{M}_0)$. Consequently, the rejection region $W(C) = \{y \in \mathbb{R}^n : T(y) \ge C\}$ and its complement are invariant under $G(\mathfrak{M}_0)$.

4. The set $\{y \in \mathbb{R}^n : T(y) = C\}$ is a $\lambda_{\mathbb{R}^n}$-null set for every $0 < C < \infty$.

5. Suppose $0 < C < \infty$ holds. Then $\{y \in \mathbb{R}^n\backslash N^* : T(y) > C\}$ $(= \{y \in \mathbb{R}^n : T(y) > C\})$ is an open set in $\mathbb{R}^n$, which is guaranteed to be non-empty under Assumption 6. Consequently, under Assumption 6 the rejection region $W(C)$ contains a non-empty open set and thus satisfies $\lambda_{\mathbb{R}^n}(W(C)) > 0$.

6. Suppose $0 < C < \infty$ holds. Then $\{y \in \mathbb{R}^n\backslash N^* : T(y) < C\}$ is a non-empty open set in $\mathbb{R}^n$. Consequently, the complement of the rejection region $W(C)$ contains a non-empty open set and thus satisfies $\lambda_{\mathbb{R}^n}(\mathbb{R}^n\backslash W(C)) > 0$.

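7. Suppose Assumption 7 and $0 < C < \infty$ hold. Then, for every $\mu_0 \in \mathfrak{M}_0$, every sequence $\nu_m \in \Pi_{(\mathfrak{M}_0 - \mu_0)^{\perp}}(\mathfrak{M}_1 - \mu_0)$ with $\|\nu_m\| \to \infty$, and for every sequence $\Phi_m$ of positive definite symmetric $n \times n$ matrices with $\Phi_m \to \Phi$, $\Phi$ a positive definite matrix, we have that

$$\liminf_{m\to\infty} P_{\mu_0+\nu_m,\Phi_m}(W(C)) = \inf_{v \in A((\nu_m)_{m\ge 1})} \Pr\left(v'\check\Omega^{-1}(\Phi^{1/2}G)v > 0\right) = \inf_{v \in A((\nu_m)_{m\ge 1})} \Pr\left(v'\check\Omega^{-1}(\Phi^{1/2}G)v \ge 0\right) \qquad (26)$$

where $A((\nu_m)_{m\ge 1})$ is the set of all accumulation points of the sequence $R(X'X)^{-1}X'\nu_m / \|R(X'X)^{-1}X'\nu_m\|$. A lower bound that does not depend on the sequence $\nu_m$ is as follows:

$$\liminf_{m\to\infty} P_{\mu_0+\nu_m,\Phi_m}(W(C)) \ge \inf_{v \in \mathbb{R}^q, \|v\|=1} \Pr\left(v'\check\Omega^{-1}(\Phi^{1/2}G)v > 0\right) = \inf_{v \in \mathbb{R}^q, \|v\|=1} \Pr\left(v'\check\Omega^{-1}(\Phi^{1/2}G)v \ge 0\right) \ge \Pr\left(\check\Omega(\Phi^{1/2}G) \text{ is nonnegative definite}\right). \qquad (27)$$

In particular, if $\check\Omega$ is nonnegative definite $\lambda_{\mathbb{R}^n}$-almost everywhere (implying that Assumption 7 is satisfied), this lower bound is 1.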

Remark 5.16. (i) Because $A((\nu_m)_{m\ge 1})$ is a closed subset of the unit ball in $\mathbb{R}^q$ and because the map $v \mapsto \Pr(v'\check\Omega^{-1}(\Phi^{1/2}G)v > 0)$ is continuous on the unit ball under Assumption 7, we see that the expressions in (26) are positive if and only if

$$\lambda_{\mathbb{R}^n}\left(\{y \in \mathbb{R}^n\backslash N^* : v'\check\Omega^{-1}(y)v > 0\}\right) > 0 \qquad (28)$$

holds for every $v \in A((\nu_m)_{m\ge 1})$. Under Assumption 7 we have $\lambda_{\mathbb{R}^n}(\{y \in \mathbb{R}^n\backslash N^* : v'\check\Omega^{-1}(y)v > 0\}) = \lambda_{\mathbb{R}^n}(\{y \in \mathbb{R}^n\backslash N^* : v'\check\Omega^{-1}(y)v \ge 0\})$ for every $v \neq 0$ and hence, by continuity of $\check\Omega^{-1}(y)$ on $\mathbb{R}^n\backslash N^*$, condition (28) for some $v \neq 0$ is in turn equivalent to $v'\check\Omega^{-1}(y)v > 0$ for some $y = y(v) \in \mathbb{R}^n\backslash N^*$.

(ii) Let $\check\beta$ and $\check\Omega$ satisfy Assumption 5, let $T$ be the test statistic defined in (25), and suppose that we now use a "random" critical value $\hat{C} = \hat{C}(y) > 0$ for $y \in \mathbb{R}^n$. Suppose that $\hat{C}$ satisfies the invariance condition $\hat{C}(\alpha y + X\gamma) = \hat{C}(y)$ for every $y \in \mathbb{R}^n\backslash N$, every $\alpha \neq 0$, and for every $\gamma \in \mathbb{R}^k$. Rewriting the rejection region $\{y \in \mathbb{R}^n : T(y) \ge \hat{C}\}$ as $\{y \in \mathbb{R}^n : T(y)/\hat{C} \ge 1\}$ and observing that $\tilde\Omega(y) = \hat{C}(y)\check\Omega(y)$ satisfies Assumption 5 shows that the results of this subsection also apply to the test with rejection region $\{y \in \mathbb{R}^n : T(y) \ge \hat{C}\}$.

As a corollary to Theorem 5.7 we now obtain negative size and power results for tests of the form (25). The semicontinuity conditions in Theorem 5.7 are implied by continuity properties of the estimators $\check\Omega$ and $\check\beta$ used in the construction of the test. The sufficient conditions so obtained are easy to verify in practice and become particularly simple in the practically relevant case where $\dim(\mathcal{Z}) = 1$, cf. the remark following the corollary.



Corollary 5.17. Let $\check\beta$ and $\check\Omega$ satisfy Assumption 5 and let $T$ be the test statistic defined in (25). Furthermore, let $W(C) = \{y \in \mathbb{R}^n : T(y) \ge C\}$ with $0 < C < \infty$ be the rejection region. Suppose that $\mathcal{Z}$ is a concentration space of the covariance model $\mathfrak{C}$. Recall that $N$ is the exceptional set in Assumption 5 and that $N^*$ is given by (24). Then the following holds:

1. Suppose we have for some $\mu_0 \in \mathfrak{M}_0$ that $z \in \mathbb{R}^n\backslash N^*$ and $T(\mu_0 + z) > C$ hold simultaneously $\lambda_{\mathcal{Z}}$-almost surely. Then

$$\sup_{\Sigma \in \mathfrak{C}} P_{\mu_0,\sigma^2\Sigma}(W(C)) = 1$$

holds for every $\mu_0 \in \mathfrak{M}_0$ and every $0 < \sigma^2 < \infty$. In particular, the size of the test is equal to one.

2. Suppose we have for some $\mu_0 \in \mathfrak{M}_0$ that $z \in \mathbb{R}^n\backslash N^*$ and $T(\mu_0 + z) < C$ hold simultaneously $\lambda_{\mathcal{Z}}$-almost surely. Then

$$\inf_{\Sigma \in \mathfrak{C}} P_{\mu_0,\sigma^2\Sigma}(W(C)) = 0$$

holds for every $\mu_0 \in \mathfrak{M}_0$ and every $0 < \sigma^2 < \infty$, and hence

$$\inf_{\mu_1 \in \mathfrak{M}_1}\ \inf_{\Sigma \in \mathfrak{C}} P_{\mu_1,\sigma^2\Sigma}(W(C)) = 0$$

holds for every $0 < \sigma^2 < \infty$. In particular, the test is biased (except in the trivial case where its size is zero). Furthermore, the nuisance-infimal rejection probability at every point $\mu_1 \in \mathfrak{M}_1$ is zero, i.e.,

$$\inf_{0 < \sigma^2 < \infty}\ \inf_{\Sigma \in \mathfrak{C}} P_{\mu_1,\sigma^2\Sigma}(W(C)) = 0.$$

In particular, the infimal power of the test is equal to zero.

3. Suppose $\check\Omega$ is nonnegative definite on $\mathbb{R}^n\backslash N$. If $z \in \mathbb{R}^n\backslash N$, $\check\Omega(z) = 0$, and $R\check\beta(z) \neq 0$ hold simultaneously $\lambda_{\mathcal{Z}}$-almost surely, then

$$\sup_{\Sigma \in \mathfrak{C}} P_{\mu_0,\sigma^2\Sigma}(W(C)) = 1$$

holds for every $\mu_0 \in \mathfrak{M}_0$ and every $0 < \sigma^2 < \infty$. In particular, the size of the test is equal to one.

Remark 5.18. (i) Since $T$ in the above corollary is invariant under $G(\mathfrak{M}_0)$, the condition in the corollary does not depend on the particular choice of $\mu_0 \in \mathfrak{M}_0$. Furthermore, if $\mathcal{Z}$ is one-dimensional, the invariance of $T$ shows that $T(\mu_0 + z) > C$ already holds for all $z \in \mathcal{Z}$ with $z \neq 0$ provided it holds for one $z \in \mathcal{Z}$ with $z \neq 0$. In a similar vein, Part 1 of Lemma 5.15 implies for one-dimensional $\mathcal{Z}$ that $z \in \mathbb{R}^n\backslash N^*$ holds for all $z \in \mathcal{Z}$ with $z \neq 0$ if and only if $z \in \mathbb{R}^n\backslash N^*$ holds for at least one $z \in \mathcal{Z}$ with $z \neq 0$. In view of Assumption 5 a similar statement also applies to the relations $z \in \mathbb{R}^n\backslash N$, $\check\Omega(z) = 0$, and $R\check\beta(z) \neq 0$.

(ii) In case the covariance model $\mathfrak{C}$ contains AR(1) correlation matrices, a remark analogous to Remark 5.9 also applies here. Furthermore, note that the concentration spaces derived from the AR(1) correlation matrices are one-dimensional, and hence the discussion in (i) above applies.

The negative result in the preceding corollary does not apply if substantial portions of $\mathcal{Z}$ belong to the exceptional set $N$ (which in particular occurs if $\mathcal{Z} \subseteq \mathfrak{M}$ holds and $N$ is not empty as then $\mathcal{Z} \subseteq \mathfrak{M} \subseteq N$). For this case we provide a further negative result which is applicable provided (29) given below holds. For example, if $\mathcal{Z} = \mathrm{span}(e_+)$ and the design matrix contains an intercept, we immediately obtain $\mathcal{Z} \subseteq \mathfrak{M}$, and (29) holds if and only if the column in $R$ corresponding to the intercept is nonzero. The significance of the subsequent theorem is that it provides an upper bound $K_1$ for the power in certain directions which is less than or equal to a lower bound for the size. This will typically imply biasedness of the test (except if equality holds in (30)). Furthermore, note that the result implies that the test has size 1 in case $\check\Omega$ is positive definite $\lambda_{\mathbb{R}^n}$-almost everywhere. The condition on the covariance model $\mathfrak{C}$ is often satisfied, see Remark 5.20 following the theorem.

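Theorem 5.19. Let $\check\beta$ and $\check\Omega$ satisfy Assumptions 5 and 7, let $T$ be the test statistic defined in (25), and let $W(C) = \{y \in \mathbb{R}^n : T(y) \ge C\}$ with $0 < C < \infty$ be the rejection region. Assume that there is a sequence $\Sigma_m \in \mathfrak{C}$ such that $\Sigma_m \to \bar\Sigma$ for $m \to \infty$ where $\bar\Sigma$ is singular with $l := \dim\mathrm{span}(\bar\Sigma) > 0$. Suppose that for some sequence of positive real numbers $s_m$ the matrix $D_m = \Pi_{\mathrm{span}(\bar\Sigma)^{\perp}}\Sigma_m\Pi_{\mathrm{span}(\bar\Sigma)^{\perp}}/s_m$ converges to a matrix $D$, which is regular on $\mathrm{span}(\bar\Sigma)^{\perp}$, and that $\Pi_{\mathrm{span}(\bar\Sigma)^{\perp}}\Sigma_m\Pi_{\mathrm{span}(\bar\Sigma)}/s_m^{1/2} \to 0$. Suppose further that $\mathrm{span}(\bar\Sigma) \subseteq \mathfrak{M}$, and let $Z$ be a matrix, the columns of which form a basis for $\mathrm{span}(\bar\Sigma)$. Assume also that

$$R\hat\beta(z) \neq 0 \quad \lambda_{\mathrm{span}(\bar\Sigma)}\text{-a.e.} \qquad (29)$$

is satisfied. Then for every $\mu_0 \in \mathfrak{M}_0$, every $\sigma$ with $0 < \sigma < \infty$, and every $M \ge 0$ we have

$$\inf_{\gamma \in \mathbb{R}^l, \|\gamma\| \ge M}\ \inf_{\Sigma \in \mathfrak{C}} P_{\mu_0 + Z\gamma,\sigma^2\Sigma}(W(C)) \le K_1 \le K_2 \le \sup_{\Sigma \in \mathfrak{C}} P_{\mu_0,\sigma^2\Sigma}(W(C)). \qquad (30)$$

The constants $K_1$ and $K_2$ are given by

$$K_1 = \inf_{\gamma \in \mathbb{R}^l} \Pr\left(\bar\xi(\gamma) \ge 0\right) = \inf_{\|\gamma\| = 1} \Pr\left(\bar\xi(\gamma) \ge 0\right)$$

and

$$K_2 = \int \Pr\left(\bar\xi(\gamma) \ge 0\right) dP_{0,A}(\gamma)$$

with the random variable $\bar\xi(\gamma)$ given by

$$\bar\xi(\gamma) = (R\hat\beta(Z\gamma))'\,\check\Omega^{-1}\left((\bar\Sigma^{1/2} + D^{1/2})G\right) R\hat\beta(Z\gamma)$$

on the event $\{(\bar\Sigma^{1/2} + D^{1/2})G \in \mathbb{R}^n\backslash N^*\}$ and by $\bar\xi(\gamma) = 0$ otherwise, where $G$ is a standard normal $n$-vector. The matrix $A$ denotes $(Z'Z)^{-1}Z'\bar\Sigma Z(Z'Z)^{-1}$, which is nonsingular, and $P_{0,A}$ denotes the Gaussian distribution on $\mathbb{R}^l$ with mean zero and variance covariance matrix $A$.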

Remark 5.20. Suppose the covariance model $\mathfrak{C}$ contains $\mathfrak{C}_{AR(1)}$, or, more generally, $\mathfrak{C}$ contains AR(1) correlation matrices $\Lambda(\rho_m)$ for some sequence $\rho_m \in (-1,1)$ with $\rho_m \to 1$ ($\rho_m \to -1$, respectively). Then all the conditions on the covariance model in the preceding theorem are satisfied with $\bar\Sigma = e_+e_+'$, $\mathrm{span}(\bar\Sigma) = \mathrm{span}(e_+)$, and $Z = e_+$ ($\bar\Sigma = e_-e_-'$, $\mathrm{span}(\bar\Sigma) = \mathrm{span}(e_-)$, and $Z = e_-$, respectively); cf. Lemma G.1 in Appendix G. Furthermore, condition (29) simplifies to $R\hat\beta(e_+) \neq 0$ ($R\hat\beta(e_-) \neq 0$, respectively).

The subsequent theorem specializes the positive results given in Theorems 5.10 and 5.12 to the class of tests considered in the present subsection.

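Theorem 5.21. Let $\check\beta$ and $\check\Omega$ satisfy Assumptions 5, 6, and 7. Let $T$ be the test statistic defined in (25). Furthermore, let $W(C) = \{y \in \mathbb{R}^n : T(y) \ge C\}$ with $0 < C < \infty$ be the rejection region. Suppose further that

$$T(y + z) = T(y) \quad \text{for every } y \in \mathbb{R}^n \text{ and every } z \in J(\mathfrak{C}). \qquad (31)$$

Assume that $\mathfrak{C}$ is bounded (as a subset of $\mathbb{R}^{n\times n}$). Assume also that for every sequence $\Sigma_m \in \mathfrak{C}$ converging to a singular $\bar\Sigma$ there exists a subsequence $(m_i)_{i\in\mathbb{N}}$ and a sequence of positive real numbers $s_{m_i}$ such that the sequence of matrices $D_{m_i} = \Pi_{\mathrm{span}(\bar\Sigma)^{\perp}}\Sigma_{m_i}\Pi_{\mathrm{span}(\bar\Sigma)^{\perp}}/s_{m_i}$ converges to a matrix $D$ which is regular on the orthogonal complement of $\mathrm{span}(\bar\Sigma)$. Then the following holds:

1. The size of the rejection region $W(C)$ is strictly less than 1, i.e.,

$$\sup_{\mu_0 \in \mathfrak{M}_0}\ \sup_{0 < \sigma^2 < \infty}\ \sup_{\Sigma \in \mathfrak{C}} P_{\mu_0,\sigma^2\Sigma}(W(C)) < 1.$$

Furthermore,

$$\inf_{\mu_0 \in \mathfrak{M}_0}\ \inf_{0 < \sigma^2 < \infty}\ \inf_{\Sigma \in \mathfrak{C}} P_{\mu_0,\sigma^2\Sigma}(W(C)) > 0.$$

2. Suppose that $\lambda_{\mathbb{R}^n}(\{y \in \mathbb{R}^n\backslash N^* : v'\check\Omega^{-1}(y)v > 0\}) > 0$ for every $v \in \mathbb{R}^q$ with $\|v\| = 1$. Then the infimal power is bounded away from zero, i.e.,

$$\inf_{\mu_1 \in \mathfrak{M}_1}\ \inf_{0 < \sigma^2 < \infty}\ \inf_{\Sigma \in \mathfrak{C}} P_{\mu_1,\sigma^2\Sigma}(W(C)) > 0.$$

3. Suppose that $\check\Omega$ is nonnegative definite $\lambda_{\mathbb{R}^n}$-almost everywhere. Then for every $0 < c < \infty$

$$\inf_{\substack{\mu_1 \in \mathfrak{M}_1,\ 0 < \sigma^2 < \infty \\ d(\mu_1,\mathfrak{M}_0)/\sigma \ge c}} P_{\mu_1,\sigma^2\Sigma_m}(W(C)) \to 1$$

holds for $m \to \infty$ and for any sequence $\Sigma_m \in \mathfrak{C}$ satisfying $\Sigma_m \to \bar\Sigma$ with $\bar\Sigma$ a singular matrix. Furthermore, for every sequence $0 < c_m < \infty$

$$\inf_{\substack{\mu_1 \in \mathfrak{M}_1 \\ d(\mu_1,\mathfrak{M}_0) \ge c_m}} P_{\mu_1,\sigma_m^2\Sigma_m}(W(C)) \to 1$$

holds for $m \to \infty$ whenever $0 < \sigma_m^2 < \infty$, $c_m/\sigma_m \to \infty$, and the sequence $\Sigma_m \in \mathfrak{C}$ satisfies $\Sigma_m \to \bar\Sigma$ with $\bar\Sigma$ a positive definite matrix. [The very last statement even holds without recourse to condition (31) and the condition on $\mathfrak{C}$ following (31).]

4. For every $\delta$, $0 < \delta < 1$, there exists a $C(\delta)$, $0 < C(\delta) < \infty$, such that

$$\sup_{\mu_0 \in \mathfrak{M}_0}\ \sup_{0 < \sigma^2 < \infty}\ \sup_{\Sigma \in \mathfrak{C}} P_{\mu_0,\sigma^2\Sigma}(W(C(\delta))) \le \delta.$$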

Remark 5.22. (i) In case the covariance model $\mathfrak{C}$ equals $\mathfrak{C}_{AR(1)}$, a remark analogous to Remark 5.14 also applies here.

(ii) Under the assumptions of the preceding theorem, the additional condition in Part 2 of the theorem is equivalent to $v'\check\Omega^{-1}(y)v > 0$ for every $v \in \mathbb{R}^q$ with $\|v\| = 1$ and a suitable $y = y(v) \in \mathbb{R}^n\backslash N^*$. Cf. Remark 5.16(i).

We now discuss when the preceding theorem can be expected to apply and how the crucial condition (31) can be enforced. As already noted prior to Theorem 5.10, a sufficient condition for (31) to be satisfied for any test statistic $T$ of the form (25), based on estimators $\check\beta$ and $\check\Omega$ satisfying Assumption 5, is that $J(\mathfrak{C}) \subseteq \mathfrak{M}_0 - \mu_0$ for some (and hence all) $\mu_0 \in \mathfrak{M}_0$ holds. This sufficient condition is equivalent to $J(\mathfrak{C}) \subseteq \mathfrak{M}$ and $R\hat\beta(z) = 0$ for every $z \in J(\mathfrak{C})$, because $\mathfrak{M}_0 - \mu_0$ coincides with the set $\{\mu \in \mathfrak{M} : R\hat\beta(\mu) = 0\}$. [Note that replacing $J(\mathfrak{C})$ by $\mathrm{span}(J(\mathfrak{C}))$ in the preceding two sentences leads to equivalent statements because $\mathfrak{M}_0 - \mu_0$ as well as $\mathfrak{M}$ are linear spaces.] Now consider the general case where $J(\mathfrak{C})$, or equivalently $\mathrm{span}(J(\mathfrak{C}))$, may not be a subset of $\mathfrak{M}_0 - \mu_0$: If there exists a $z \in \mathrm{span}(J(\mathfrak{C})) \cap \mathfrak{M}$ with $z \notin \mathfrak{M}_0 - \mu_0$ (i.e., with $R\hat\beta(z) \neq 0$), then any test statistic $T$ of the form (25), based on estimators $\check\beta$ and $\check\Omega$ satisfying Assumptions 5 and 6, does not satisfy the invariance condition (31), see Lemma F.3 in Appendix F. Hence, $\mathrm{span}(J(\mathfrak{C})) \cap \mathfrak{M} \subseteq \mathfrak{M}_0 - \mu_0$, or in other words $R\hat\beta(z) = 0$ for every $z \in \mathrm{span}(J(\mathfrak{C})) \cap \mathfrak{M}$, is a necessary condition for (31) to be satisfied for some $T$ as above. We next show how a test statistic of the form (25) satisfying the crucial invariance condition (31) can in fact be constructed if we impose this necessary condition.



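Proposition 5.23. Let $\mathfrak{C}$ be a covariance model and suppose that $\mathrm{span}(J(\mathfrak{C})) \cap \mathfrak{M} \subseteq \mathfrak{M}_0 - \mu_0$ holds.

1. Let $\bar{\mathfrak{M}}$ be the linear space spanned by $J(\mathfrak{C}) \cup \mathfrak{M}$. Define $\bar X = (X, \bar x_1, \ldots, \bar x_p)$ where $\bar x_i \in \mathrm{span}(J(\mathfrak{C}) \cup (\mathfrak{M}_0 - \mu_0))$ are chosen in such a way that the columns of $\bar X$ form a basis of $\bar{\mathfrak{M}}$. Assume that $k < k + p < n$ holds. Suppose $\check\theta$ and $\check\Omega$ are estimators satisfying the analogue of Assumption 5 but with $k$ replaced by $k + p$, $X$ replaced by $\bar X$, and $\mathfrak{M}$ replaced by $\bar{\mathfrak{M}}$. Let $\bar N$ denote the null set appearing in that analogue of Assumption 5 and $\bar N^* = \bar N \cup \{y \in \mathbb{R}^n\backslash \bar N : \det\check\Omega(y) = 0\}$. Define $\check\beta = (I_k, 0)\check\theta$. Then $\check\beta$ and $\check\Omega$ satisfy the original Assumption 5 (with $N$ given by $\bar N$), and the test statistic $\bar T$ given by

$$\bar T(y) = \begin{cases} (R\check\beta(y) - r)'\,\check\Omega^{-1}(y)\,(R\check\beta(y) - r), & y \in \mathbb{R}^n\backslash \bar N^*, \\ 0, & y \in \bar N^*, \end{cases}$$

satisfies the invariance condition (31).

2. Let $\bar{\mathfrak{M}}$ and $\bar X$ be as above and $k < k + p < n$. Suppose $\check\theta(y) = (\bar X'\bar X)^{-1}\bar X'y$ is the least squares estimator based on $\bar X$. Then the requirements on $\check\theta$ postulated in the above mentioned analogue of Assumption 5 are satisfied, and $R\check\beta(z) = 0$ holds for every $z \in \mathrm{span}(J(\mathfrak{C}))$. Furthermore, if $\bar X^* = (X, \bar x_1^*, \ldots, \bar x_p^*)$ is obtained in the same way as is $\bar X$ but for another choice of elements $\bar x_i^* \in \mathrm{span}(J(\mathfrak{C}) \cup (\mathfrak{M}_0 - \mu_0))$ and if $\check\theta^*$ denotes the least squares estimator w.r.t. the design matrix $\bar X^*$, then $R\check\beta(y) = R\check\beta^*(y)$ holds for every $y \in \mathbb{R}^n$ with $\check\beta^*$ denoting $(I_k, 0)\check\theta^*$.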

We next discuss ways of choosing $\bar x_1, \ldots, \bar x_p$ such that they satisfy the requirements in the preceding proposition: One natural way is to first find $z_1, \ldots, z_r$ in $J(\mathfrak{C})$ that form a basis of $\mathrm{span}\,J(\mathfrak{C})$. From these vectors then select $\bar x_1 = z_{i_1}, \ldots, \bar x_p = z_{i_p}$ to complement the columns of $X$ to a basis of $\bar{\mathfrak{M}}$. An alternative way is based on the observation that adding elements of $\mathfrak{M}_0 - \mu_0$ to each of the previously found $z_{i_j}$ obviously gives rise to another feasible choice of $\bar x_j$. It hence follows that an alternative feasible choice for the $\bar x_j$ is to use the projections of the $z_{i_j}$ onto the orthogonal complement of $\mathfrak{M}_0 - \mu_0$. Of course, if the estimator $\check\theta$ is chosen to be the least squares estimator, then Part 2 of the preceding proposition informs us that the particular choice of the $\bar x_i$
has no effect on Rf3(y) since it is invariant under the choice of the aV [In particular, it follows that 
/3(y) = $(y) holds for the choice Xi = Zy — IIgjto-/i z y J us t discussed.] 

Remark 5.24. (i) Suppose that the assumptions of Proposition 5.23 hold, except that now $p = 0$ holds. Then $J(\mathfrak{C}) \subseteq \mathfrak{M}$ and hence $R\hat\beta(z) = 0$ holds for every $z \in J(\mathfrak{C})$, implying that actually the sufficient condition mentioned prior to the proposition is satisfied. Consequently, as discussed above, the invariance condition (31) is already satisfied for every $T$ of the form (25) based on estimators $\check\beta$ and $\check\Omega$ satisfying Assumption 5.

(ii) Suppose that the assumptions of Proposition 5.23 hold, except that now $k + p = n$ holds (note that $k + p \le n$ always holds). Suppose further that $T$ is a test statistic of the form (25) based on estimators $\check\beta$ and $\check\Omega$ satisfying Assumptions 5 and 6. Then $T$ can never satisfy (31) and hence Theorem 5.21 does not apply in this situation. This can be seen as follows: Because of $k + p = n$ it follows that every $y \in \mathbb{R}^n$ can be written as a linear combination of finitely many $z_i \in J(\mathfrak{C})$ plus an element $\mu$ in $\mathfrak{M}$. Because invariance w.r.t. addition of elements $z \in J(\mathfrak{C})$ is equivalent to invariance w.r.t. addition of elements $z \in \mathrm{span}(J(\mathfrak{C}))$ (cf. Remark 5.11(i)) we see that $T(y) = T(\mu)$ would have to hold under (31). As noted after the introduction of Assumption 5, either $\mathfrak{M} \subseteq N \subseteq N^*$ holds or $N$ is empty. In the second case we have that $\check\Omega(\mu) = 0$ as a consequence of equivariance. Hence in both cases we arrive at $\mu \in N^*$ and thus at $T(\mu) = 0$. But this shows that $T$ is constant equal to zero, contradicting Part 5 of Lemma 5.15.

(iii) Proposition 15.231 uses the auxiliary matrix X and the associated estimators 9 to construct 
an estimator [3 for the parameter /3 in the originally given regression model (TTJ) and this estimator /3 
is then used to construct a test statistic T for the testing problem (J3J) to which Theorem l5.21l can be 
applied. In an alternative view we can consider the auxiliary model Y = X9+XJ with 9 = [B , (, J as 
a model in its own right. [Of course, if we maintain model ([T]) then £ = must hold in the auxiliary 
model.] Define the q x (k+p) matrix R = R (I k , 0), define 97T = {(J- £ 971 : fi = X9, R9 = r} and 
set 97li = 97l\97to, and define a null hypothesis H and an alternative hypothesis Hi analogously 
as in (U). Proposition 15.231 can now be viewed as stating that condition (I5T|) is satisfied for the test 
statistic which is obtained by using (|25[) based on the restriction matrix R and on the estimators 9 
and Ct figuring in Proposition [03] Consequently, Theorem 15.211 can be directly applied to this test 
statistic (provided 0, satisfies Assumptions [5] and [7]) . It should be noted that the so-obtained result 
now applies to the problem of testing $\bar H_0$ versus $\bar H_1$. However, since $\mathfrak{M}_0 \subseteq \bar{\mathfrak{M}}_0$ and $\mathfrak{M}_1 \subseteq \bar{\mathfrak{M}}_1$ hold and since $\bar T$ is invariant under translation by elements in $\mathrm{span}(J(\mathfrak{C}))$, we essentially recover the same result as before.

5.5 Non-Gaussian distributions 

As already noted in Section 2, the negative results given in this paper immediately extend in a trivial way without imposing the Gaussianity assumption on the error vector $U$ in (1) as long as the assumptions on the feasible error distributions are weak enough to ensure that the implied set of distributions for $Y$ contains the set $\{P_{\mu,\sigma^2\Sigma} : \mu \in \mathfrak{M},\ 0 < \sigma^2 < \infty,\ \Sigma \in \mathfrak{C}\}$, but possibly contains also other distributions.

Another, less trivial, extension is as follows: Suppose that $U$ is elliptically distributed in the sense that it has the same distribution as $\varrho\sigma\Sigma^{1/2}\mathbf{E}$ where $0 < \sigma < \infty$, $\Sigma \in \mathfrak{C}$, $\mathbf{E}$ is a random vector uniformly distributed on the unit sphere $S^{n-1}$, and $\varrho$ is a random variable distributed independently of $\mathbf{E}$ satisfying $\Pr(\varrho > 0) = 1$. [If $\varrho$ is distributed as the square root of a chi-square with $n$ degrees of freedom we recover the Gaussian situation described in Section 2.] If $\varphi$ is a test that is invariant under the group $G(\mathfrak{M}_0)$ then it is easy to see that for $\mu_0 \in \mathfrak{M}_0$

$$E\left(\varphi(\mu_0 + \varrho\sigma\Sigma^{1/2}\mathbf{E})\right) = E\left(\varphi(\mu_0 + \Sigma^{1/2}\mathbf{E})\right)$$

holds (cf. footnote 16). Since this does not depend on the distribution of $\varrho$ at all, we learn that the rejection probability under the null hypothesis is the same as in the Gaussian case. As a consequence, all results concerning only the null behavior of $\varphi$ obtained under Gaussianity in the paper extend immediately to regression models in which the disturbance vector $U$ is elliptically distributed in the
above sense. Furthermore, all results concerning rejection probabilities under the alternative which 
are obtained from the behavior of the null rejection probabilities by an approximation argument 
(e.g., Parts 2 and 3 of Theorem 15 .71 as well as of Corollary 15 .171 and the corresponding applications 
of these results in Sections and 2]) also go through in view of Scheffe's lemma provided the density 
of £>E exists and is almost surely continuous. 
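
The identity used above can be spelled out in one line. The following is a sketch, assuming (as recalled from the paper's earlier sections rather than restated in this subsection) that the elements of $G(\mathfrak{M}_0)$ are the maps $y \mapsto \alpha(y - \mu_0) + \mu_0'$ with $\alpha \neq 0$ and $\mu_0, \mu_0' \in \mathfrak{M}_0$. Applying such a map with $\mu_0' = \mu_0$ and $\alpha = (\varrho\sigma)^{-1}$, which is well-defined on the event $\{\varrho > 0\}$, gives

$$\varphi\left(\mu_0 + \varrho\sigma\Sigma^{1/2}\mathbf{E}\right) = \varphi\left((\varrho\sigma)^{-1}\left(\mu_0 + \varrho\sigma\Sigma^{1/2}\mathbf{E} - \mu_0\right) + \mu_0\right) = \varphi\left(\mu_0 + \Sigma^{1/2}\mathbf{E}\right).$$

Since $\Pr(\varrho > 0) = 1$, taking expectations on both sides yields the displayed equality of rejection probabilities, and the right-hand side involves neither $\varrho$ nor $\sigma$.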

A Appendix: Proofs for Subsection 3.1

Proof of Lemma 3.1: Observe that $\hat\Omega_w(y) = B(y)W_nB'(y)$. Given that $W_n$ is positive definite due to Assumption 2, this immediately establishes Parts 1-3 of the Lemma. It remains to prove Part 4. Let $s$ be as in Assumption 3 and consider first the case where this assumption is satisfied, i.e., where $\mathrm{rank}(R(X'X)^{-1}X'(\neg(i_1, \ldots, i_s))) = q$ holds. If now $y$ is such that $\hat\Omega_w(y)$ is singular it follows, in view of the equivalent condition $\mathrm{rank}(B(y)) < q$, that $\hat u_i(y) = 0$ must hold at least for some $i \notin \{i_1, \ldots, i_s\}$ where $i$ may depend on $y$. But this means that $y$ satisfies $e_i'(n)(I_n - X(X'X)^{-1}X')y = 0$. Since $e_i'(n)(I_n - X(X'X)^{-1}X') \neq 0$ by construction of $i$, it follows that the set of $y$ for which $\hat\Omega_w(y)$ is singular is contained in a finite union of proper linear subspaces, and hence is a $\lambda_{\mathbb{R}^n}$-null set. Next consider the case where Assumption 3 is not satisfied. Observe that then $s > 0$ must hold. Note that $\hat u_i(y) = 0$ holds for all $y \in \mathbb{R}^n$ and all $i \in \{i_1, \ldots, i_s\}$ by construction of $\{i_1, \ldots, i_s\}$. But then for every $y \in \mathbb{R}^n$

$$\mathrm{rank}(B(y)) = \mathrm{rank}\left(R(X'X)^{-1}X'(\neg(i_1, \ldots, i_s))A(y)\right) \le \mathrm{rank}\left(R(X'X)^{-1}X'(\neg(i_1, \ldots, i_s))\right) < q$$

is satisfied where $A(y)$ is obtained from $\mathrm{diag}(\hat u_1(y), \ldots, \hat u_n(y))$ by deleting rows and columns $i$ with $i \in \{i_1, \ldots, i_s\}$. This completes the proof. ■

16 Under an additional absolute continuity assumption this is also true for almost invariant tests $\varphi$.

Lemma A. 1. Suppose Assumptions^ and\3[ are satisfied. Then (3 andCl w satisfy Assumption\^\^ 
and [?] with N — 0. In /aci, Cl w (y) is nonnegative definite for every y G R n , and is positive definite 
Aru -almost everywhere. The test statistic T defined in FJd, with ty w as in {3|), is invariant under the 
group G (£D?o) and the rejection probabilities -P AljCT 2 E (T > C) depend on ([i, a 2 , SJ G 971 x (0, oo) x £ 
on/j/ through ((R0 — r) /a, S) (in fact, only through (((R/3 — r) /a) , T,)), where corresponds to \x 
via [i = X0. 

Proof. Clearly, $\hat\beta$ and $\hat\Omega_w$ are well-defined and continuous on $\mathbb{R}^n$, hence we may set $N = \emptyset$ in Assumption 5. Symmetry of $\hat\Omega_w$ as well as the required equivariance properties of $\hat\beta$ and $\hat\Omega_w$ are obviously satisfied. By Assumption 2, $\hat\Omega_w(y)$ is nonnegative definite for every $y \in \mathbb{R}^n$. By Assumptions 2 and 3 and Lemma 3.1 the matrix $\hat\Omega_w$ is nonsingular (and hence positive definite) $\lambda_{\mathbb{R}^n}$-almost everywhere. Hence Assumptions 5, 6, and 7 are satisfied, which proves the first claim. The remaining claims follow immediately from Lemma 5.15 and Proposition 5.4. □

Proof of Theorem 3.3: By Lemma A.1 we know that $\hat\beta$ and $\hat\Omega_w$ satisfy Assumption 5 and that $\hat\Omega_w(y)$ is nonnegative definite for every $y \in \mathbb{R}^n$. Furthermore, in view of this lemma and because $N = \emptyset$, the set $N^*$ in Corollary 5.17 is precisely the set of $y$ for which $\mathrm{rank}(B(y)) < q$, cf. Lemma 3.1. By Assumption 1 the spaces $\mathcal{Z}_+ = \mathrm{span}(e_+)$ and $\mathcal{Z}_- = \mathrm{span}(e_-)$ are concentration spaces of $\mathfrak{C}$. The theorem now follows by applying Corollary 5.17 and Remark 5.18(i) to $\mathcal{Z}_+$ as well as to $\mathcal{Z}_-$ and by noting that $e_+ \in \mathbb{R}^n\backslash N^*$ translates into $\mathrm{rank}(B(e_+)) = q$ with a similar translation if $e_+$ is replaced by $e_-$. Also note that the size of the test cannot be zero in view of Part 5 of Lemma 5.15 and Lemma A.1. ■

Proof of Proposition 3.5: (1) Define the matrix $B_X^*(y) = (\det(X'X))^2 B_X(y)$ and observe that (for given $y$) every element of this matrix is a multivariate polynomial in the elements $x_{ti}$ of $X$ because $(X'X)^{-1}$ can be written as $(\det(X'X))^{-1}\mathrm{adj}(X'X)$ (with the convention that $\mathrm{adj}(X'X) = 1$ if $k = 1$). Because $\det(X'X) \neq 0$ for $X \in \mathfrak{X}_0$ holds, we have

$$\mathfrak{X}_1(e_+) = \mathfrak{X}_0 \cap \left\{X \in \mathbb{R}^{n\times k} : \det\left(B_X^*(e_+)B_X^{*\prime}(e_+)\right) = 0\right\}.$$

The set to the right of the intersection operation in the above display is obviously the zero-set of a multivariate polynomial in the variables $x_{ti}$. Thus it is an algebraic set, and hence is either a $\lambda_{\mathbb{R}^{n\times k}}$-null set or is all of $\mathbb{R}^{n\times k}$. However, the latter case cannot arise because we can choose an $n \times k$ matrix $X^{\#} \in \mathfrak{X}_0$, say, such that all its columns are orthogonal to $e_+$ (this being possible since $k < n$ by assumption) and this matrix then satisfies $\mathrm{rank}(B_{X^{\#}}(e_+)) = q$. This shows that $\mathfrak{X}_1(e_+)$ is a $\lambda_{\mathbb{R}^{n\times k}}$-null set. Next consider $\mathfrak{X}_2(e_+)$: Observe that for $X \in \mathfrak{X}_0\backslash\mathfrak{X}_1(e_+)$ we have $\det(\hat\Omega_{w,X}(e_+)) \neq 0$ and hence for $X \in \mathfrak{X}_0\backslash\mathfrak{X}_1(e_+)$ the relation $T_X(e_+ + \mu_0) = C$ can equivalently be written as

$$(R\,\mathrm{adj}(X'X)X'e_+)'\,\mathrm{adj}\,\hat\Omega_{w,X}(e_+)\,(R\,\mathrm{adj}(X'X)X'e_+) - (\det(X'X))^2\det(\hat\Omega_{w,X}(e_+))\,C = 0.$$

Furthermore, for $X \in \mathfrak{X}_0$ we can write $\hat\Omega_{w,X}(e_+)$ as $(\det(X'X))^{-4}B_X^*(e_+)W_nB_X^{*\prime}(e_+)$. Note that $B_X^*(e_+)W_nB_X^{*\prime}(e_+)$ is a multivariate polynomial in the variables $x_{ti}$. Consequently, for $X \in \mathfrak{X}_0\backslash\mathfrak{X}_1(e_+)$ the relation $T_X(e_+ + \mu_0) = C$ can, after multiplication by $(\det(X'X))^{4q-2}$, which is nonzero for $X \in \mathfrak{X}_0$, equivalently be written as

$$(\det(X'X))^2\,(R\,\mathrm{adj}(X'X)X'e_+)'\,\mathrm{adj}\left(B_X^*(e_+)W_nB_X^{*\prime}(e_+)\right)(R\,\mathrm{adj}(X'X)X'e_+) - \det\left(B_X^*(e_+)W_nB_X^{*\prime}(e_+)\right)C = 0.$$

The left-hand side of the above display is now a multivariate polynomial in the elements $x_{ti}$. The polynomial does not vanish on all of $\mathbb{R}^{n\times k}$ since the matrix $X^{\#}$ constructed before provides an element in $\mathfrak{X}_0\backslash\mathfrak{X}_1(e_+)$ for which $T_{X^{\#}}(e_+ + \mu_0) = 0 < C$ holds. The proofs for $\mathfrak{X}_1(e_-)$ and $\mathfrak{X}_2(e_-)$ are completely analogous, as is the proof for the fact that $\mathbb{R}^{n\times k}\backslash\mathfrak{X}_0$ is a $\lambda_{\mathbb{R}^{n\times k}}$-null set. Finally, that the set of all design matrices $X \in \mathfrak{X}_0$ for which Theorem 3.3 does not apply is a subset of $(\mathfrak{X}_1(e_+) \cup \mathfrak{X}_2(e_+)) \cap (\mathfrak{X}_1(e_-) \cup \mathfrak{X}_2(e_-))$ is obvious upon observing that the set of all $X \in \mathfrak{X}_0$ which do not satisfy Assumption 3 is contained in $\mathfrak{X}_1(e_+)$ as well as in $\mathfrak{X}_1(e_-)$.

(2) Similar arguments as in the proof of Part 1 show that Xi (e-) and X2 (e~) are each contained 

in an algebraic set. Define the matrix X" = ( e + , X$ ) where the columns of X$ are k — 1 linearly 

independent unit vectors that are orthogonal on e + as well as e_. It is then easy to see that 
X** G Xo\Xi (e_), implying that Xi (e_) does not coincide with all of Xo- Furthermore, simple 
computation shows that T x »(e— + Ho) — < C by the assumption on R, which implies that 
X2 (e_) is a proper subset of Xo\Xi (e_). It follows now as above that Xi (e_) and X2 (e_) are 
A R nx<fc-i)-null sets. The rest of the proof now proceeds as before. 

(3) See Example EH ■ 

Proof of Theorem 3.6: We verify the assumptions of Theorem 5.21. By Lemma A.1, Assumptions 5, 6, and 7 are satisfied. Because of $\mathfrak{C} = \mathfrak{C}_{AR(1)}$ we have that $J(\mathfrak{C}) = \mathrm{span}(e_+) \cup \mathrm{span}(e_-)$, see Lemma G.1, and because $e_+, e_- \in \mathfrak{M}$ is assumed we conclude that $J(\mathfrak{C}) \subseteq \mathfrak{M}$. The assumption $R\hat\beta(e_+) = R\hat\beta(e_-) = 0$ then implies that even $J(\mathfrak{C}) \subseteq \mathfrak{M}_0 - \mu_0$ holds. The invariance condition (31) in Theorem 5.21 is thus satisfied, because $T$ is $G(\mathfrak{M}_0)$-invariant by Lemma 5.15. The assumptions on $\mathfrak{C}$ in Theorem 5.21 are satisfied in view of Lemma G.1. Finally the assumptions on $\hat\Omega_w$ in Parts 2 and 3 of Theorem 5.21 are satisfied because $\hat\Omega_w$ is positive definite $\lambda_{\mathbb{R}^n}$-almost everywhere as shown in Lemma A.1. The theorem now follows from Theorem 5.21 using a standard subsequence argument for Part 3. The claim in parenthesis in Part 3 follows from the corresponding claim in parenthesis in Theorem 5.21 and the observation that the conditions on $e_+$ and $e_-$ in the theorem were only used to verify condition (31). ■

Proof of Theorem 3.7: Similarly as in the preceding proof, verify the assumptions of Theorem 5.21, but now for $\bar\beta$ and $\bar\Omega_w$, by making use of Proposition 5.23. Note that the condition $\mathrm{span}(J(\mathfrak{C})) \cap \mathfrak{M} \subseteq \mathfrak{M}_0 - \mu_0$ is satisfied in all five parts of the theorem. This is obvious for Parts 1-3. For Part 4 this follows from the following argument: Observe that $e_- = \delta e_+ + X\gamma$ must hold by the assumptions of Part 4. Now suppose $m \in \mathrm{span}(J(\mathfrak{C})) \cap \mathfrak{M}$. Then $\alpha_+ e_+ + \alpha_- e_- = m = X\gamma^*$ must hold. These relations together imply $(\alpha_+ + \alpha_-\delta) e_+ = X(\gamma^* - \alpha_-\gamma)$. Because $e_+ \notin \mathfrak{M}$, it follows that $\gamma^* - \alpha_-\gamma = 0$. Thus

\[ R(X'X)^{-1}X'm = R\gamma^* = \alpha_- R\gamma = \alpha_- \bar{R}(\gamma' : \delta)' = \alpha_- \bar{R}(\bar{X}'\bar{X})^{-1}\bar{X}'e_- = 0, \]

which establishes that $m \in \mathfrak{M}_0 - \mu_0$. The verification for Part 5 is completely analogous. ■

Proof of Lemma 3.10: Since $\hat\Omega_w(y) = nB(y)W^{*}B'(y)$, Parts 1-3 of the Lemma follow immediately from nonnegative definiteness of $W^{*}$. To prove Part 4 observe that $\hat\Omega_w(y)$ is singular if and only if $\det(B(y)W^{*}B'(y)) = 0$. Now observe that the l.h.s. of this equation is a multivariate polynomial in $y$, hence the solution set is an algebraic set and thus is either a $\lambda_{\mathbb{R}^n}$-null set or all of $\mathbb{R}^n$. ■

Proof of Theorem 3.11: The proof is completely analogous to the proof of Theorem 3.3 using Lemma G.2 in case $\nu \in (0, \pi)$. ■

B Appendix: Proofs for Subsection 3.2

Proof of Lemma 3.13: The inclusion $\mathfrak{M} \subseteq N_0(a_1, a_2)$ is trivial since $\hat{u}(y) = 0$ for $y \in \mathfrak{M}$. Because $a_1 \in \{1, 2\}$, $a_2 \in \{n-1, n\}$ with $a_1 \le a_2$ holds, $N_0(a_1, a_2)$ is contained in $N_1(a_1, a_2)$, establishing the first claim. Closedness of $N_1(a_1, a_2)$ is obvious. Given the just established inclusion $N_0(a_1, a_2) \subseteq N_1(a_1, a_2)$, the alternative description of $N_1(a_1, a_2)$ given in the second claim is also immediately seen to be true. Continuity of $\hat\rho$ on $\mathbb{R}^n \setminus N_0(a_1, a_2)$ is obvious. Assume now that $k \le a_2 - a_1$ holds. If $a_1 = 1$, $a_2 = n$, i.e., $\hat\rho = \hat\rho_{YW}$, we have $N_1(a_1, a_2) = N_0(a_1, a_2) = \mathfrak{M}$ because $\hat\rho_{YW}$ is well-defined and bounded away from one in modulus on $\mathbb{R}^n \setminus \mathfrak{M}$ as shown in Remark 3.12(i). Hence, $N_1(a_1, a_2)$ is a $\lambda_{\mathbb{R}^n}$-null set in this case as $k < n$ holds by assumption. To establish this result also for the other choices of $a_1$ and $a_2$ note that $N_1(a_1, a_2)$ is the zero set of a multivariate polynomial in $y$. It hence is a $\lambda_{\mathbb{R}^n}$-null set, provided we can show that the polynomial is not identically zero. Observe that we now have $n - k \ge n - a_2 + a_1 \ge 2$ (as we have already disposed of the case $a_1 = 1$, $a_2 = n$). Let $y^{(1)}, \ldots, y^{(n-k)}$ be a basis for $\mathfrak{M}^{\perp}$. The submatrix obtained from $(y^{(1)}, \ldots, y^{(n-k)})$ by selecting the rows with index $j$ satisfying $j < a_1$ as well as the rows with $j > a_2$ has dimension $(n - a_2 + a_1 - 1) \times (n - k)$ and thus has rank at most $n - a_2 + a_1 - 1 < n - k$. Consequently, we can find constants $c_1, \ldots, c_{n-k}$, not all equal to zero, such that the $j$-th component of $y_0 = \sum_{i=1}^{n-k} c_i y^{(i)}$ is zero whenever $j < a_1$ or $j > a_2$. Because $y_0 \in \mathfrak{M}^{\perp}$ and $y_0 \neq 0$ by construction, we have $y_0 \in \mathbb{R}^n \setminus \mathfrak{M} = \mathbb{R}^n \setminus N_0(1, n)$. Because $y_0 \neq 0$ and because the $j$-th component of $y_0 = \hat{u}(y_0)$ is zero whenever $j < a_1$ or $j > a_2$, we also have $y_0 \in \mathbb{R}^n \setminus N_0(a_1, a_2)$. Hence, $\hat\rho(y_0)$ as well as $\hat\rho_{YW}(y_0)$ are well-defined. Furthermore, they coincide in view of the construction of $y_0 = \hat{u}(y_0)$. By what was said above for the Yule-Walker estimator it follows that $|\hat\rho(y_0)| = |\hat\rho_{YW}(y_0)| < 1$. Hence $y_0 \in \mathbb{R}^n \setminus N_1(a_1, a_2)$, and the polynomial is not identically equal to zero. ■
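The two facts used at the end of this proof can be illustrated numerically. The sketch below assumes, as suggested by the role of $a_1$ and $a_2$ and by equation (32), that the estimator has the form $\hat\rho = \sum_{t=2}^{n} \hat{u}_t \hat{u}_{t-1} / \sum_{t=a_1}^{a_2} \hat{u}_t^2$, with the Yule-Walker estimator as the case $a_1 = 1$, $a_2 = n$. The function names are illustrative only and do not come from the paper.

```python
import random

def rho_hat(u, a1, a2):
    """Assumed form of the estimator: sum_{t=2}^n u_t u_{t-1} divided by
    sum_{t=a1}^{a2} u_t^2, with 1-based time indices as in the text."""
    n = len(u)
    num = sum(u[t] * u[t - 1] for t in range(1, n))
    den = sum(u[t] ** 2 for t in range(a1 - 1, a2))
    if den == 0.0:
        raise ValueError("denominator vanishes, estimator is undefined")
    return num / den

def rho_yw(u):
    """Yule-Walker case a1 = 1, a2 = n."""
    return rho_hat(u, 1, len(u))

random.seed(0)
n = 8

# Cauchy-Schwarz keeps the Yule-Walker estimate strictly inside (-1, 1).
for _ in range(1000):
    u = [random.gauss(0.0, 1.0) for _ in range(n)]
    assert abs(rho_yw(u)) < 1.0

# A residual vector whose entries vanish for t < a1 and t > a2 (like y_0 in
# the proof) gives the same value for both estimators.
a1, a2 = 2, n - 1
u0 = [0.0] + [random.gauss(0.0, 1.0) for _ in range(n - 2)] + [0.0]
assert abs(rho_hat(u0, a1, a2) - rho_yw(u0)) < 1e-12
```

The second check mirrors the construction of $y_0$: once the residuals outside the window $a_1, \ldots, a_2$ are zero, the shortened denominator equals the full one, so the bound for the Yule-Walker estimator carries over.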

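Lemma B.1. Suppose $\hat\rho$ satisfies Assumption 4.

1. The sets $\mathbb{R}^n \setminus N_0(a_1, a_2)$, $\mathbb{R}^n \setminus N_1(a_1, a_2)$, and $\mathbb{R}^n \setminus N_2(a_1, a_2)$ are invariant under the group of transformations $y \mapsto \alpha y + X\gamma$ where $\alpha \neq 0$, $\gamma \in \mathbb{R}^k$.

2. The estimators $\tilde\beta$, $\tilde\sigma^2$, and $\tilde\Omega$ are well-defined and continuous on $\mathbb{R}^n \setminus N_2(a_1, a_2)$. They satisfy the equivariance conditions $\tilde\beta(\alpha y + X\gamma) = \alpha\tilde\beta(y) + \gamma$, $\tilde\sigma^2(\alpha y + X\gamma) = \alpha^2\tilde\sigma^2(y)$, and $\tilde\Omega(\alpha y + X\gamma) = \alpha^2\tilde\Omega(y)$ for $\alpha \neq 0$, $\gamma \in \mathbb{R}^k$, and $y \in \mathbb{R}^n \setminus N_2(a_1, a_2)$. The estimator $\tilde\Omega(y)$ is (well-defined and) nonsingular if and only if $y \in \mathbb{R}^n \setminus N_2^*(a_1, a_2)$. The sets $N_2(a_1, a_2)$ and $N_2^*(a_1, a_2)$ are closed. If $k \le a_2 - a_1$ holds, $N_2(a_1, a_2)$ and $N_2^*(a_1, a_2)$ are $\lambda_{\mathbb{R}^n}$-null sets.

3. The estimator $\hat\Omega$ is well-defined and continuous on $\mathbb{R}^n \setminus N_0(a_1, a_2)$, whereas $\hat\beta$ and $\hat\sigma^2$ are well-defined and continuous on all of $\mathbb{R}^n$. They satisfy the equivariance conditions $\hat\beta(\alpha y + X\gamma) = \alpha\hat\beta(y) + \gamma$, $\hat\sigma^2(\alpha y + X\gamma) = \alpha^2\hat\sigma^2(y)$ for $\alpha \neq 0$, $\gamma \in \mathbb{R}^k$, and $y \in \mathbb{R}^n$, as well as $\hat\Omega(\alpha y + X\gamma) = \alpha^2\hat\Omega(y)$ for $\alpha \neq 0$, $\gamma \in \mathbb{R}^k$, and $y \in \mathbb{R}^n \setminus N_0(a_1, a_2)$. Furthermore, $\hat\sigma^2(y) > 0$ holds for $y \in \mathbb{R}^n \setminus \mathfrak{M} \supseteq \mathbb{R}^n \setminus N_0^*(a_1, a_2)$, and hence $\hat\Omega(y)$ is (well-defined and) nonsingular if and only if $y \in \mathbb{R}^n \setminus N_0^*(a_1, a_2)$. The set $N_0^*(a_1, a_2)$ is closed. If $k \le a_2 - a_1$ holds, $N_0^*(a_1, a_2)$ is a $\lambda_{\mathbb{R}^n}$-null set. [Recall from Remark 3.12(iv) that $N_0(a_1, a_2)$ is always a closed set, and is a $\lambda_{\mathbb{R}^n}$-null set in case $k \le a_2 - a_1$.]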
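The invariance and equivariance claims of the lemma all rest on the behaviour of the OLS residual vector under $y \mapsto \alpha y + X\gamma$. The following sketch checks this numerically for a single regressor ($k = 1$). The data, the helper names, and the assumed form $\hat\rho = \sum_{t=2}^{n} \hat{u}_t \hat{u}_{t-1} / \sum_{t=a_1}^{a_2} \hat{u}_t^2$ are illustrative assumptions, not taken from the paper.

```python
def ols_residuals(y, x):
    """Residuals from regressing y on one regressor x (k = 1, no intercept)."""
    beta = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    return [yi - beta * xi for yi, xi in zip(y, x)]

def rho_hat(u, a1, a2):
    """Assumed estimator: sum_{t=2}^n u_t u_{t-1} / sum_{t=a1}^{a2} u_t^2."""
    num = sum(u[t] * u[t - 1] for t in range(1, len(u)))
    den = sum(u[t] ** 2 for t in range(a1 - 1, a2))
    return num / den

# Hypothetical data with n = 6 observations.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [0.5, -1.0, 2.0, 0.3, -0.7, 1.5]
alpha, gamma = -2.5, 3.0

u = ols_residuals(y, x)
y_transformed = [alpha * yi + gamma * xi for yi, xi in zip(y, x)]
u_transformed = ols_residuals(y_transformed, x)

# Residuals are equivariant: u(alpha*y + X*gamma) = alpha * u(y).
assert all(abs(b - alpha * a) < 1e-9 for a, b in zip(u, u_transformed))

# Hence rho_hat is invariant, because alpha^2 cancels in the ratio.
a1, a2 = 2, 5
assert abs(rho_hat(u_transformed, a1, a2) - rho_hat(u, a1, a2)) < 1e-9
```

Since every estimator in the lemma is built from $\hat\rho$, the residuals, and fixed matrices, the stated scaling factors $\alpha$ and $\alpha^2$ follow from these two properties.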

Proof. (1) The invariance of the first two sets follows since $\hat{u}(\alpha y + X\gamma) = \alpha\hat{u}(y)$ holds for every $y \in \mathbb{R}^n$, $\alpha \neq 0$, and $\gamma \in \mathbb{R}^k$. This property of the residual vector implies $\hat\rho(\alpha y + X\gamma) = \hat\rho(y)$ for every $\alpha \neq 0$, $\gamma \in \mathbb{R}^k$ and $y \in \mathbb{R}^n \setminus N_0(a_1, a_2) \supseteq \mathbb{R}^n \setminus N_1(a_1, a_2)$. Together with the already established invariance of $\mathbb{R}^n \setminus N_1(a_1, a_2)$ this implies invariance of $\mathbb{R}^n \setminus N_2(a_1, a_2)$ upon observing that $\Lambda^{-1}(\hat\rho(y))$ is well-defined for $y \in \mathbb{R}^n \setminus N_1(a_1, a_2)$. The latter holds because for $b \in \mathbb{R}$, $|b| \neq 1$ the matrix $\Lambda(b)$ is nonsingular. [This can, e.g., be seen from the fact that its inverse is given by the symmetric tridiagonal matrix with diagonal equal to $(1, 1 + b^2, \ldots, 1 + b^2, 1)/(1 - b^2)$ and with the elements next to the diagonal given by $-b/(1 - b^2)$.]
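The tridiagonal form of the inverse can be checked directly. The sketch below assumes that $\Lambda(b)$ is the $n \times n$ matrix with $(i,j)$-th entry $b^{|i-j|}$ and that $n \ge 2$. All function names are illustrative.

```python
def ar1_matrix(b, n):
    """Assumed form of Lambda(b): entry (i, j) equals b^|i-j|."""
    return [[b ** abs(i - j) for j in range(n)] for i in range(n)]

def tridiagonal_inverse(b, n):
    """Candidate inverse from the text: diagonal (1, 1+b^2, ..., 1+b^2, 1)/(1-b^2),
    entries next to the diagonal -b/(1-b^2), zeros elsewhere (n >= 2)."""
    c = 1.0 - b * b
    inv = [[0.0] * n for _ in range(n)]
    for i in range(n):
        inv[i][i] = (1.0 if i in (0, n - 1) else 1.0 + b * b) / c
        if i + 1 < n:
            inv[i][i + 1] = -b / c
            inv[i + 1][i] = -b / c
    return inv

def matmul(a, b):
    """Plain triple-loop matrix product for small matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def max_dev_from_identity(b, n):
    """Largest absolute entry of Lambda(b) * candidate inverse - I."""
    prod = matmul(ar1_matrix(b, n), tridiagonal_inverse(b, n))
    return max(abs(prod[i][j] - (1.0 if i == j else 0.0))
               for i in range(n) for j in range(n))

# The identity holds for |b| < 1 and for |b| > 1 alike; only |b| = 1 fails.
assert max_dev_from_identity(0.6, 5) < 1e-12
assert max_dev_from_identity(-0.9, 6) < 1e-9
assert max_dev_from_identity(2.0, 4) < 1e-9
```

The case $b = 2$ illustrates that nonsingularity does not require $|b| < 1$, which is why the proof only needs to exclude $|b| = 1$.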

(2) Using Lemma 3.13 and the just established fact that $\Lambda^{-1}(\hat\rho(y))$ is well-defined for $y \in \mathbb{R}^n \setminus N_1(a_1, a_2)$, we see that $\tilde\beta$, $\tilde\sigma^2$, and $\tilde\Omega$ are well-defined and continuous on $\mathbb{R}^n \setminus N_2(a_1, a_2) \subseteq \mathbb{R}^n \setminus N_1(a_1, a_2)$. Observing that $\hat\rho(\alpha y + X\gamma) = \hat\rho(y)$ holds for $\alpha \neq 0$, $\gamma \in \mathbb{R}^k$, and $y \in \mathbb{R}^n \setminus N_0(a_1, a_2) \supseteq \mathbb{R}^n \setminus N_1(a_1, a_2)$, the claimed equivariance of $\tilde\beta$, $\tilde\sigma^2$, and $\tilde\Omega$ follows. The third claim is obvious, and the fourth claim follows easily from Lemma 3.13. We next prove the last claim for the Yule-Walker estimator, i.e., for $a_1 = 1$ and $a_2 = n$: For this it suffices to show that $N_2^*(1, n) \subseteq \mathfrak{M}$ since $\mathfrak{M}$ is a proper subspace of $\mathbb{R}^n$ in view of the assumption $k < n$. Now for arbitrary $y \notin \mathfrak{M} = N_0(1, n)$ we have that $\hat\rho_{YW}(y)$ is well-defined and satisfies $|\hat\rho_{YW}(y)| < 1$ (cf. Remark 3.12(i)), implying $y \in \mathbb{R}^n \setminus N_1(1, n)$ as well as positive definiteness of $\Lambda(\hat\rho_{YW}(y))$. But this gives positive definiteness, and hence nonsingularity, of $X'\Lambda^{-1}(\hat\rho_{YW}(y))X$, implying that $y \in \mathbb{R}^n \setminus N_2(1, n)$. It also delivers positive definiteness of $R(X'\Lambda^{-1}(\hat\rho(y))X)^{-1}R'$. Furthermore, $y \notin \mathfrak{M}$ implies $y - X\tilde\beta(y) \neq 0$ and thus $\tilde\sigma^2(y) > 0$ in view of the just established positive definiteness of $\Lambda(\hat\rho_{YW}(y))$. But this gives $y \in \mathbb{R}^n \setminus N_2^*(1, n)$, completing the proof for the case $a_1 = 1$ and $a_2 = n$. To prove the claim for the remaining values of $a_1$ and $a_2$ we first show that $N_2(a_1, a_2)$ is a $\lambda_{\mathbb{R}^n}$-null set: observe that $N_2(a_1, a_2)$ is the union of $N_1(a_1, a_2)$ and $\{y \in \mathbb{R}^n \setminus N_1(a_1, a_2) : \det(X'\Lambda^{-1}(\hat\rho(y))X) = 0\}$. In view of Lemma 3.13 it hence suffices to show that the latter set is a $\lambda_{\mathbb{R}^n}$-null set. Using the relation $D^{-1} = \mathrm{adj}(D)/\det(D)$ (with the convention that $\mathrm{adj}(D) = 1$ if $D$ is $1 \times 1$) and noting that $\det(\Lambda(\hat\rho(y))) \neq 0$ for $y \in \mathbb{R}^n \setminus N_1(a_1, a_2)$, the set in question can be rewritten as

\[ A = \{y \in \mathbb{R}^n \setminus N_1(a_1, a_2) : \det(X'\,\mathrm{adj}(\Lambda(\hat\rho(y)))\,X) = 0\}. \]

Note that the equation in the set in the above display is polynomial in $\hat\rho(y)$. Upon multiplying the equation defining $A$ by $\left(\sum_{t=a_1}^{a_2} \hat{u}_t^2(y)\right)^d$, which is non-zero on $\mathbb{R}^n \setminus N_1(a_1, a_2)$, where $d = (n-1)^2 k$, the set $A$ is seen to be the intersection of $\mathbb{R}^n \setminus N_1(a_1, a_2)$ with the zero-set of a multivariate polynomial in $y$. Hence, $A$ is a $\lambda_{\mathbb{R}^n}$-null set provided we can establish that the polynomial is not identically zero. For this it suffices to find a $y \in \mathbb{R}^n \setminus N_1(a_1, a_2)$ such that $\det(X'\Lambda^{-1}(\hat\rho(y))X) \neq 0$: Set $y = y_0$ where $y_0$ has been constructed in the proof of Lemma 3.13. Observe that $\hat\rho(y_0) = \hat\rho_{YW}(y_0)$ for the estimator $\hat\rho$ specified by $a_1$ and $a_2$ and hence $y_0 \in \mathbb{R}^n \setminus N_1(a_1, a_2) \subseteq \mathbb{R}^n \setminus \mathfrak{M}$ since $|\hat\rho_{YW}(y_0)| < 1$ holds. But then $\det(X'\Lambda^{-1}(\hat\rho(y_0))X) \neq 0$ holds because $\Lambda(\hat\rho_{YW}(y))$ is always positive definite (whenever it is defined) as has been established before. This shows that $N_2(a_1, a_2)$ is a $\lambda_{\mathbb{R}^n}$-null set. It remains to show that $N_2^*(a_1, a_2)$ is a $\lambda_{\mathbb{R}^n}$-null set. For this it suffices to show that

\[ B = \{y \in \mathbb{R}^n \setminus N_2(a_1, a_2) : \tilde\sigma^2(y) = 0\} \]

as well as

\[ C = \{y \in \mathbb{R}^n \setminus N_2(a_1, a_2) : \det(R(X'\Lambda^{-1}(\hat\rho(y))X)^{-1}R') = 0\} \]

are $\lambda_{\mathbb{R}^n}$-null sets. Noting that $\det(\Lambda(\hat\rho(y))) \neq 0$ as well as $\det(X'\,\mathrm{adj}(\Lambda(\hat\rho(y)))\,X) \neq 0$ hold for $y \in \mathbb{R}^n \setminus N_2(a_1, a_2)$, the set $B$ can be rewritten as

\[ B = \left\{y \in \mathbb{R}^n \setminus N_2(a_1, a_2) : \det(\Lambda(\hat\rho(y))) \left[\det(X'\,\mathrm{adj}(\Lambda(\hat\rho(y)))\,X)\right]^2 \tilde\sigma^2(y) = 0\right\}. \]

Again the equation in the set in the above display is polynomial in $y$ and $\hat\rho(y)$. Upon multiplying this by $\left(\sum_{t=a_1}^{a_2} \hat{u}_t^2(y)\right)^d$, which is non-zero on $\mathbb{R}^n \setminus N_2(a_1, a_2)$, where $d = (n-1)^2(2k+1)$, one sees that $B$ is the intersection of $\mathbb{R}^n \setminus N_2(a_1, a_2)$ with the zero-set of a multivariate polynomial in $y$. To establish that $B$ is a $\lambda_{\mathbb{R}^n}$-null set it thus suffices to find a $y \in \mathbb{R}^n \setminus N_2(a_1, a_2)$ with $\tilde\sigma^2(y) > 0$. Choose $y_0$ as above. Then we know that $y_0 \in \mathbb{R}^n \setminus N_1(a_1, a_2)$ and $\det(X'\Lambda^{-1}(\hat\rho(y_0))X) \neq 0$ hold, i.e., $y_0 \in \mathbb{R}^n \setminus N_2(a_1, a_2)$. Furthermore, as shown before $\Lambda(\hat\rho(y_0))$ is positive definite (since $\Lambda(\hat\rho(y_0)) = \Lambda(\hat\rho_{YW}(y_0))$) and $y_0 - X\tilde\beta(y_0) \neq 0$ holds (since $y_0 \notin \mathfrak{M}$). Consequently, $\tilde\sigma^2(y_0) > 0$ holds. The proof for $C$ is very similar.

(3) Well-definedness is trivial and continuity follows from continuity of $\hat\rho$ on the open set $\mathbb{R}^n \setminus N_0(a_1, a_2)$ (cf. Lemma 3.13). Equivariance of $\hat\beta$ and $\hat\sigma^2$ is obvious, while the equivariance property of $\hat\Omega$ follows from invariance of $\mathbb{R}^n \setminus N_0(a_1, a_2)$ and the equivariance of $\hat\rho$ established in (1). The third claim is obvious. Closedness of $N_0^*(a_1, a_2)$ follows from the continuity property of $\hat\rho$ established in Lemma 3.13. To prove the final claim observe that $N_0^*(a_1, a_2)$ is the union of the $\lambda_{\mathbb{R}^n}$-null set $N_0(a_1, a_2)$ with

\[ \{y \in \mathbb{R}^n \setminus N_0(a_1, a_2) : \det(R(X'X)^{-1}X'\Lambda(\hat\rho(y))X(X'X)^{-1}R') = 0\}. \]

Multiplying the equation defining this set by $\left(\sum_{t=a_1}^{a_2} \hat{u}_t^2(y)\right)^{q(n-1)}$, which is non-zero on $\mathbb{R}^n \setminus N_0(a_1, a_2)$, one sees that the above set is the intersection of $\mathbb{R}^n \setminus N_0(a_1, a_2)$ with the zero-set of a multivariate polynomial in $y$. Again perusing $y_0$ constructed before shows that the polynomial is not identically zero, which then delivers the desired result. □



The following lemma is an immediate consequence of Lemma B.1.

Lemma B.2. Suppose $\hat\rho$ satisfies Assumption 4 and $k \le a_2 - a_1$ holds. Then $\tilde\beta$ and $\tilde\Omega$ satisfy Assumption 5 with $N = N_2(a_1, a_2)$, and the set $N^*$ (cf. equation \2J$ ) is given by $N_2^*(a_1, a_2)$. Similarly, $\hat\beta$ and $\hat\Omega$ satisfy Assumption 5 with $N = N_0(a_1, a_2)$ and the set $N^*$ is given by $N_0^*(a_1, a_2)$. The sets $N_0^*(a_1, a_2)$ and $N_2^*(a_1, a_2)$ are invariant under the group of transformations $y \mapsto \alpha y + X\gamma$ where $\alpha \neq 0$, $\gamma \in \mathbb{R}^k$.

Proof. The lemma except for the last claim follows from Lemma B.1. The last claim then follows from Lemma F.1 in Appendix F, cf. also the discussion following Assumption 5. □

Lemma B.3. Suppose $\hat\rho$ satisfies Assumption 4 and $k \le a_2 - a_1$ holds. Then $\tilde\Omega$ and $\hat\Omega$ satisfy Assumptions 6 and 7 with $N^* = N_2^*(a_1, a_2)$ in case of $\tilde\Omega$ and with $N^* = N_0^*(a_1, a_2)$ in case of $\hat\Omega$.

Proof. Consider first the case of the Yule-Walker estimator, i.e., $a_1 = 1$ and $a_2 = n$. Then $\Lambda(\hat\rho_{YW}(y))$ is positive definite for every $y \notin N_0(a_1, a_2)$. Hence $\tilde\Omega(y)$ is positive definite for $y \notin N_2^*(a_1, a_2)$ and $\hat\Omega(y)$ is positive definite for $y \notin N_0^*(a_1, a_2)$. Consequently, Assumptions 6 and 7 are clearly satisfied. Next consider the case where $a_1 \neq 1$ or $a_2 \neq n$. Then $y_0$ constructed in the proof of Lemma 3.13 satisfies $y_0 \in \mathbb{R}^n \setminus N_0^*(a_1, a_2)$ as well as $y_0 \in \mathbb{R}^n \setminus N_2^*(a_1, a_2)$ as shown in the proof of Lemma B.1. Because of $\hat\rho(y_0) = \hat\rho_{YW}(y_0)$, we also see that $\tilde\Omega(y_0)$ as well as $\hat\Omega(y_0)$ are positive definite (as the variance covariance estimators based on $\hat\rho$ coincide with the ones based on the Yule-Walker estimator). This shows that Assumption 6 is satisfied for $\tilde\Omega$ and $\hat\Omega$. It remains to establish Assumption 7. Let $v \neq 0$, $v \in \mathbb{R}^q$ be arbitrary. The preceding argument has shown $y_0 \in \mathbb{R}^n \setminus N_2^*(a_1, a_2)$ and $y_0 \in \mathbb{R}^n \setminus N_0^*(a_1, a_2)$ and also shows that $v'\tilde\Omega^{-1}(y_0)v > 0$ and $v'\hat\Omega^{-1}(y_0)v > 0$ hold. To complete the proof it suffices to show that the set $\{y \in \mathbb{R}^n \setminus N_2^*(a_1, a_2) : v'\tilde\Omega^{-1}(y)v = 0\}$ is the intersection of $\mathbb{R}^n \setminus N_2^*(a_1, a_2)$ with the zero-set of a multivariate polynomial, and similarly for $\{y \in \mathbb{R}^n \setminus N_0^*(a_1, a_2) : v'\hat\Omega^{-1}(y)v = 0\}$. This is proved in a similar manner as in the proof of Lemma B.1 by rewriting all inverse matrices appearing in $v'\tilde\Omega^{-1}(y)v$ ($v'\hat\Omega^{-1}(y)v$, respectively) in terms of the adjoints and determinants and observing that the determinants are all non-zero for $y \in \mathbb{R}^n \setminus N_2^*(a_1, a_2)$ ($y \in \mathbb{R}^n \setminus N_0^*(a_1, a_2)$, respectively). This shows that $v'\tilde\Omega^{-1}(y)v = 0$ ($v'\hat\Omega^{-1}(y)v = 0$, respectively) can be rewritten as a polynomial equation in $\hat\rho(y)$. Multiplying this polynomial equation by a suitable power of $\sum_{t=a_1}^{a_2} \hat{u}_t^2(y)$, which is non-zero on $\mathbb{R}^n \setminus N_2^*(a_1, a_2)$ ($\mathbb{R}^n \setminus N_0^*(a_1, a_2)$, respectively), shows that these equations can be rewritten as polynomial equations in $y$. □

Proof of Theorem 3.14: We first verify the assumptions of Corollary 5.17. Assumption 5 is satisfied for $\tilde\beta$ and $\tilde\Omega$ (with $N = N_2(a_1, a_2)$ and $N^* = N_2^*(a_1, a_2)$) as well as for $\hat\beta$ and $\hat\Omega$ (with $N = N_0(a_1, a_2)$ and $N^* = N_0^*(a_1, a_2)$) in view of Lemma B.2. In view of Assumption 1 we conclude from Lemma G.1 that $Z_+ = \mathrm{span}(e_+)$ as well as $Z_- = \mathrm{span}(e_-)$ are concentration spaces of $\mathfrak{C}$. Applying Parts 1 and 2 of Corollary 5.17 and Remark 5.18(i) to $Z_+$ as well as to $Z_-$ establishes (1) and (2) of the theorem as well as the corresponding parts of (4), if we also note that the size of the test can not be zero in view of Part 5 of Lemma 5.15 and Lemma B.3. In order to prove (3) of the theorem, we apply Theorem 5.19. First note that $\tilde\Omega$ satisfies Assumption 7 because of Lemma B.3. Furthermore, choose as the sequence $\Sigma_m$ in that theorem $\Sigma_m = \Lambda(\rho_m)$ for some sequence $\rho_m \to 1$, $\rho_m \in (-1, 1)$. Then $\bar\Sigma = e_+e_+'$ by Lemma G.1, which also provides the matrix $D$ and its required properties. Hence $l = 1$ and $\mathrm{span}(\bar\Sigma) = \mathrm{span}(e_+)$, which is contained in $\mathfrak{M}$ since $e_+ \in \mathfrak{M}$ has been assumed and $\mathfrak{M}$ is a linear space. Condition (29) in Theorem 5.19 is satisfied in view of the assumption $R\hat\beta(e_+) \neq 0$ since $\mathrm{span}(\bar\Sigma) = \mathrm{span}(e_+)$. Inspection of the constants $K_1$ and $K_2$ in Theorem 5.19 reveals that $K_1 = K_2 =: K_{FGLS}(e_+)$ since in the present case $\gamma$ is one-dimensional. That $K_{FGLS}(e_+)$ depends only on the quantities given in the theorem is obvious from the formulas for $K_1$ and $K_2$. Furthermore, if $\hat\rho = \hat\rho_{YW}$ then $\tilde\Omega$ is always positive definite on $\mathbb{R}^n \setminus N_2(1, n) = \mathbb{R}^n \setminus \mathfrak{M}$, because $|\hat\rho_{YW}| < 1$ holds, implying that $\Lambda(\hat\rho_{YW})$ is positive definite on $\mathbb{R}^n \setminus N_2(1, n)$. Inspection of the constants $K_1$ and $K_2$ then reveals $K_1 = K_2 = 1$ in that case. The claims in (3) with $e_+$ replaced by $e_-$ are proved analogously, and so are the remaining claims in (4). ■

Proof of Proposition 3.15: (1) First consider $\mathfrak{X}_{1,FGLS}(e_+)$. The condition $e_+ \in N_{2,X}^*(a_1, a_2)$ is equivalent to $e_+ \in N_{1,X}(a_1, a_2)$, or $e_+ \in \mathbb{R}^n \setminus N_{1,X}(a_1, a_2)$ but $\det(X'\Lambda^{-1}(\hat\rho_X(e_+))X) = 0$, or to $e_+ \in \mathbb{R}^n \setminus N_{1,X}(a_1, a_2)$ and $\det(X'\Lambda^{-1}(\hat\rho_X(e_+))X) \neq 0$ but $\tilde\sigma_X^2(e_+)\det(R(X'\Lambda^{-1}(\hat\rho_X(e_+))X)^{-1}R') = 0$. The first one of these three conditions can be written as

\[ \left(\sum_{t=2}^{n} \hat{u}_{t,X}(e_+)\hat{u}_{t-1,X}(e_+)\right)^2 = \left(\sum_{t=a_1}^{a_2} \hat{u}_{t,X}^2(e_+)\right)^2. \tag{32} \]

Since $\det(X'X) \neq 0$ holds for $X \in \mathfrak{X}_0$, the set of $X \in \mathfrak{X}_0$ satisfying (32) is - after multiplication of both sides of (32) by the fourth power of $\det(X'X)$ - seen to be included in the zero-set of a multivariate polynomial in the variables $x_{ti}$. Observing that $\det(\Lambda(\hat\rho_X(e_+))) \neq 0$ and $\sum_{t=a_1}^{a_2} \hat{u}_{t,X}^2(e_+) \neq 0$ for $e_+ \in \mathbb{R}^n \setminus N_{1,X}(a_1, a_2)$, the second one of the above conditions takes the equivalent form

\[ \left(\sum_{t=a_1}^{a_2} \hat{u}_{t,X}^2(e_+)\right)^{k(n-1)^2} \det(X'\,\mathrm{adj}(\Lambda(\hat\rho_X(e_+)))\,X) = 0, \qquad \left(\sum_{t=2}^{n} \hat{u}_{t,X}(e_+)\hat{u}_{t-1,X}(e_+)\right)^2 \neq \left(\sum_{t=a_1}^{a_2} \hat{u}_{t,X}^2(e_+)\right)^2. \tag{33} \]

For $X \in \mathfrak{X}_0$ satisfying the inequality in (33), the left-hand side of the equation in the preceding display is easily seen to be a polynomial in the variables $x_{ti}$ and $\hat{u}_{t,X}(e_+)$. Since $\det(X'X)\hat{u}_{t,X}(e_+)$ is polynomial in the variables $x_{ti}$ and $\det(X'X) \neq 0$ for $X \in \mathfrak{X}_0$, we may rewrite the equation in the preceding display by multiplying it by the $4k(n-1)^2$-th power of $\det(X'X)$. The resulting equivalent equation is obviously a polynomial in the variables $x_{ti}$. This shows that the set of $X \in \mathfrak{X}_0$ satisfying (33) is (a subset of) the zero-set of a multivariate polynomial. Recalling that $\det(\Lambda(\hat\rho_X(e_+))) \neq 0$ and $\sum_{t=a_1}^{a_2} \hat{u}_{t,X}^2(e_+) \neq 0$ for $e_+ \in \mathbb{R}^n \setminus N_{1,X}(a_1, a_2)$, and that $\det(X'\Lambda^{-1}(\hat\rho_X(e_+))X) \neq 0$ implies $\det(X'\,\mathrm{adj}(\Lambda(\hat\rho_X(e_+)))\,X) \neq 0$, the third one of the above conditions takes the equivalent form

\[ \left(\sum_{t=a_1}^{a_2} \hat{u}_{t,X}^2(e_+)\right)^{(n-1)^2(2k+1+q(k-1))} f(X)'\,\mathrm{adj}(\Lambda(\hat\rho_X(e_+)))\,f(X)\,g(X) = 0 \tag{34} \]

subject to

\[ \left(\sum_{t=2}^{n} \hat{u}_{t,X}(e_+)\hat{u}_{t-1,X}(e_+)\right)^2 \neq \left(\sum_{t=a_1}^{a_2} \hat{u}_{t,X}^2(e_+)\right)^2, \qquad \det(X'\Lambda^{-1}(\hat\rho_X(e_+))X) \neq 0, \tag{35} \]

where

\[ f(X) = \left[\det(X'\,\mathrm{adj}(\Lambda(\hat\rho_X(e_+)))\,X)\,I_n - X\,\mathrm{adj}(X'\,\mathrm{adj}(\Lambda(\hat\rho_X(e_+)))\,X)\,X'\,\mathrm{adj}(\Lambda(\hat\rho_X(e_+)))\right] e_+, \]
\[ g(X) = \det(R\,\mathrm{adj}(X'\,\mathrm{adj}(\Lambda(\hat\rho_X(e_+)))\,X)\,R'). \]

The left-hand side of the equation in (34) is a polynomial in the variables $x_{ti}$ as well as $\hat{u}_{t,X}(e_+)$ for all $X \in \mathfrak{X}_0$ satisfying the inequality in (35). After multiplying the left-hand side of the equation in (34) by a suitable power of $\det(X'X)$, which is non-zero for $X \in \mathfrak{X}_0$, (34) can be equivalently recast as an equation that is polynomial in $x_{ti}$, showing that the set of $X \in \mathfrak{X}_0$ satisfying (34) and (35) is a subset of the zero-set of a multivariate polynomial. It follows that $\mathfrak{X}_{1,FGLS}(e_+)$ is a $\lambda_{\mathbb{R}^{n \times k}}$-null set provided we can show that each of the three polynomials in the variables $x_{ti}$ mentioned before is not trivial. For this it certainly suffices to construct a matrix $X^* \in \mathfrak{X}_0$ such that $e_+ \notin N_{2,X^*}^*(a_1, a_2)$ holds: Consider first the case $n \ge 3$. Let the first column $x^*_{\cdot 1}$ of $X^*$ be equal to $(1, 0, \ldots, 0, 1)'$, and choose the remaining columns linearly independent in the orthogonal complement of the space spanned by $x^*_{\cdot 1}$ and $e_+$. Then $X^* \in \mathfrak{X}_0$ holds and $\hat{u}_{X^*}(e_+) = (0, 1, 1, \ldots, 1, 1, 0)'$, and hence $\hat\rho_{X^*}(e_+)$ is well-defined and equals $\hat\rho_{YW,X^*}(e_+)$, which is always less than 1 in absolute value. Consequently, $e_+ \in \mathbb{R}^n \setminus N_{1,X^*}(a_1, a_2)$ holds. Furthermore, $\Lambda(\hat\rho_{X^*}(e_+))$ is then positive definite and hence $\det(X^{*\prime}\Lambda^{-1}(\hat\rho_{X^*}(e_+))X^*) \neq 0$ and $\det(R(X^{*\prime}\Lambda^{-1}(\hat\rho_{X^*}(e_+))X^*)^{-1}R') \neq 0$ hold; also $\tilde\sigma_{X^*}^2(e_+) > 0$ follows from positive definiteness of $\Lambda(\hat\rho_{X^*}(e_+))$ and the fact that $e_+ \notin \mathrm{span}(X^*)$. But this establishes $e_+ \in \mathbb{R}^n \setminus N_{2,X^*}^*(a_1, a_2)$ in case $n \ge 3$. Next consider the case $n = 2$. Then $k = 1$ must hold. The assumption $k \le a_2 - a_1$ entails $a_2 = n = 2$ and $a_1 = 1$, i.e., $\hat\rho$ must be the Yule-Walker estimator, implying that $N_{2,X^*}^*(a_1, a_2) = \mathrm{span}(X^*)$. Choose $X^*$ as an arbitrary vector linearly independent of $e_+$ (which is possible since $n = 2 > 1 = k$). Then $X^* \in \mathfrak{X}_0$ and $e_+ \in \mathbb{R}^n \setminus N_{2,X^*}^*(a_1, a_2)$ are satisfied. The proof for $\mathfrak{X}_{1,FGLS}(e_-)$ is completely analogous where in case $n \ge 3$ the matrix $X^*$ is now chosen in such a way that $x^*_{\cdot 1}$ is equal to $(-1, 0, \ldots, 0, (-1)^n)'$ and $e_-$ takes the role of $e_+$ in the construction of the remaining columns. Next consider the set $\mathfrak{X}_{2,FGLS}(e_+)$. Observe that for $X \in \mathfrak{X}_0 \setminus \mathfrak{X}_{1,FGLS}(e_+)$ the relation $T_{FGLS,X}(e_+ + \mu_0) = C$ can equivalently be written as

\[ (R\tilde\beta_X(e_+))'\,\tilde\Omega_X^{-1}(e_+)\,(R\tilde\beta_X(e_+)) - C = 0. \tag{36} \]

Similar arguments as above show that for $X \in \mathfrak{X}_0 \setminus \mathfrak{X}_{1,FGLS}(e_+)$ this equation can equivalently be stated as $p(X) = 0$ where $p(X)$ is a polynomial in the variables $x_{ti}$. But this shows that the set of $X \in \mathfrak{X}_0 \setminus \mathfrak{X}_{1,FGLS}(e_+)$ satisfying (36) is (a subset of) an algebraic set. It follows that $\mathfrak{X}_{2,FGLS}(e_+)$ is a $\lambda_{\mathbb{R}^{n \times k}}$-null set provided the polynomial $p$ is not trivial, or in other words that there exists a matrix $X \in \mathfrak{X}_0 \setminus \mathfrak{X}_{1,FGLS}(e_+)$ that violates (36). But this is guaranteed by the provision in the theorem. The result for $\mathfrak{X}_{2,FGLS}(e_-)$ is proved in exactly the same manner. The remaining claims of Part 1 are now obvious.

(2) Similar arguments as in the proof of Part 1 show that $\tilde{\mathfrak{X}}_{1,FGLS}(e_-)$ and $\tilde{\mathfrak{X}}_{2,FGLS}(e_-)$ are each contained in an algebraic set. By the assumed provision it follows immediately that $\tilde{\mathfrak{X}}_{2,FGLS}(e_-)$ is a $\lambda_{\mathbb{R}^{n \times (k-1)}}$-null set. The same conclusion holds for $\tilde{\mathfrak{X}}_{1,FGLS}(e_-)$ if we can find a matrix $X^* = (e_+, \tilde{X}^*)$ such that $e_- \notin N_{2,X^*}^*(a_1, a_2)$. To this end let the $n \times 1$ vector $a = (-1, 0, \ldots, 0, (-1)^n)'$ be the first column of $\tilde{X}^*$ and choose the remaining $k - 2$ columns linearly independent in the orthogonal complement of the space spanned by $e_+$, $e_-$, and $a$ (which is possible since $k < n$). Simple computation now shows that $\hat{u}_{X^*}(e_-) \neq 0$ (note that $n \ge 4$ has been assumed) and that the first and last entry of $\hat{u}_{X^*}(e_-)$ is zero. Consequently, $\hat\rho_{X^*}(e_-)$ is well-defined and equals $\hat\rho_{YW,X^*}(e_-)$, which is always less than 1 in absolute value, and the same argument as in the proof of Part 1 shows that $e_- \notin N_{2,X^*}^*(a_1, a_2)$ is indeed satisfied. The remaining claims of Part 2 are now obvious.

(3) First consider $\mathfrak{X}_{1,OLS}(e_+)$. The condition $e_+ \in N_{0,X}^*(a_1, a_2)$ is equivalent to $\sum_{t=a_1}^{a_2} \hat{u}_{t,X}^2(e_+) = 0$, or $\sum_{t=a_1}^{a_2} \hat{u}_{t,X}^2(e_+) \neq 0$ but $\det(R(X'X)^{-1}X'\Lambda(\hat\rho_X(e_+))X(X'X)^{-1}R') = 0$. Similar arguments as in (1) then show that $\mathfrak{X}_{1,OLS}(e_+)$ is a subset of an algebraic set. The matrix $X^*$ constructed in (1) is easily seen to satisfy $e_+ \in \mathbb{R}^n \setminus N_{0,X^*}^*(a_1, a_2)$. Thus $\mathfrak{X}_{1,OLS}(e_+)$ is a $\lambda_{\mathbb{R}^{n \times k}}$-null set. The proof for $\mathfrak{X}_{1,OLS}(e_-)$ is exactly the same. Next consider $\mathfrak{X}_{2,OLS}(e_+)$. Observe that for $X \in \mathfrak{X}_0 \setminus \mathfrak{X}_{1,OLS}(e_+)$ the relation $T_{OLS,X}(e_+ + \mu_0) = C$ can equivalently be written as

\[ (R\hat\beta_X(e_+))'\,\hat\Omega_X^{-1}(e_+)\,(R\hat\beta_X(e_+)) - C = 0. \tag{37} \]

The same argument as in the proof of Part (1) shows that the set of $X \in \mathfrak{X}_0 \setminus \mathfrak{X}_{1,OLS}(e_+)$ satisfying (37) is (a subset of) the zero-set of a multivariate polynomial in the variables $x_{ti}$. It follows that $\mathfrak{X}_{2,OLS}(e_+)$ is a $\lambda_{\mathbb{R}^{n \times k}}$-null set under the maintained provision that it is a proper subset of $\mathfrak{X}_0 \setminus \mathfrak{X}_{1,OLS}(e_+)$. The proof for $\mathfrak{X}_{2,OLS}(e_-)$ is the same. The proof for $\tilde{\mathfrak{X}}_{1,OLS}(e_-)$ and $\tilde{\mathfrak{X}}_{2,OLS}(e_-)$ is similar to the proof for $\tilde{\mathfrak{X}}_{1,FGLS}(e_-)$ and $\tilde{\mathfrak{X}}_{2,FGLS}(e_-)$.

(4) Note that the assumptions obviously imply $e_+ \in \mathfrak{M}$ and $R\hat\beta(e_+) \neq 0$. ■

Remark B.4. In case $a_1 = 1$ and $a_2 = n$ the argument in the above proof simplifies due to the fact that $N_{0,X}^*(1, n) = N_{2,X}^*(1, n) = \mathrm{span}(X)$.

Proof of Theorem 3.16: We apply Theorem 5.21. That $(\tilde\beta, \tilde\Omega)$ as well as $(\hat\beta, \hat\Omega)$ satisfy Assumptions 5, 6, and 7 has been shown in Lemmata B.2 and B.3. The covariance model $\mathfrak{C}_{AR(1)}$ satisfies the properties required in Theorem 5.21 as shown in Lemma G.1. Furthermore, we have $J(\mathfrak{C}_{AR(1)}) = \mathrm{span}(e_+) \cup \mathrm{span}(e_-)$, see Lemma G.1, and because $e_+, e_- \in \mathfrak{M}$ is assumed we conclude that $J(\mathfrak{C}_{AR(1)}) \subseteq \mathfrak{M}$. The assumption $R\hat\beta(e_+) = R\hat\beta(e_-) = 0$ then implies that even $J(\mathfrak{C}_{AR(1)}) \subseteq \mathfrak{M}_0 - \mu_0$ holds. The invariance condition (31) in Theorem 5.21 is thus satisfied, because $T$ is $G(\mathfrak{M}_0)$-invariant by Lemma 5.15. We next show that the additional condition in Part 2 of Theorem 5.21 is satisfied. This is trivial in case the Yule-Walker estimator is used (i.e., if $a_1 = 1$ and $a_2 = n$) since then $\tilde\Omega(y)$ is positive definite for $y \notin N_2^*(a_1, a_2)$ and $\hat\Omega(y)$ is positive definite for $y \notin N_0^*(a_1, a_2)$ (see the proof of Lemma B.1) and since $N_2^*(a_1, a_2)$ and $N_0^*(a_1, a_2)$ are $\lambda_{\mathbb{R}^n}$-null sets by Lemma B.1. If $a_1 \neq 1$ or $a_2 \neq n$, then $y_0$ constructed in the proof of Lemma 3.13 satisfies $y_0 \in \mathbb{R}^n \setminus N_2^*(a_1, a_2)$ and $y_0 \in \mathbb{R}^n \setminus N_0^*(a_1, a_2)$ (cf. proof of Lemma B.1) as well as $\hat\rho(y_0) = \hat\rho_{YW}(y_0)$, implying that $\tilde\Omega(y_0)$ as well as $\hat\Omega(y_0)$ are positive definite. As shown in Lemma B.1 the matrix $\tilde\Omega$ is, in particular, continuous on the open set $\mathbb{R}^n \setminus N_2^*(a_1, a_2)$ and the matrix $\hat\Omega$ is continuous on the open set $\mathbb{R}^n \setminus N_0^*(a_1, a_2)$. Consequently, $\tilde\Omega$ and $\hat\Omega$ are positive definite in a neighborhood of $y_0$ and thus the additional condition in Part 2 of Theorem 5.21 is satisfied. Finally, the condition $a_1 = 1$ and $a_2 = n$ implies that $\tilde\Omega$ and $\hat\Omega$ are $\lambda_{\mathbb{R}^n}$-almost everywhere positive definite (since then $\hat\rho = \hat\rho_{YW}$), verifying the extra condition in Part 3 of Theorem 5.21. ■

C Appendix: Proofs for Section 4

Proof of Theorem 4.2: First observe that $\hat\beta$ and $\hat\Omega_{Het}$ satisfy Assumptions 5 and 6 with $N = \emptyset$. In fact, $\hat\Omega_{Het}(y)$ is nonnegative definite for every $y \in \mathbb{R}^n$, and is positive definite $\lambda_{\mathbb{R}^n}$-almost everywhere under Assumption 3 by Lemma 4.1. Furthermore, in view of this lemma and because $N = \emptyset$, the set $N^*$ in Corollary 5.17 is precisely the set of $y$ for which $\mathrm{rank}(B(y)) < q$. It is trivial that $Z_i = \mathrm{span}(e_i(n))$ is a concentration space of $\mathfrak{C}$ for every $i = 1, \ldots, n$. The theorem now follows by applying Corollary 5.17 and Remark 5.18(i) to $Z_i$ and by noting that $e_i(n) \in \mathbb{R}^n \setminus N^*$ translates into $\mathrm{rank}(B(e_i(n))) = q$. Also note that the size of the test can not be zero in view of Part 5 of Lemma 5.15. ■

D Appendix: Proofs for Subsection 5.1

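Proof of Proposition 5.2: Since

\[ \Pi_{(\mathfrak{N}-\nu_*)^{\perp}}(g_{\alpha,\nu,\nu'}(y) - \nu_*) = \alpha\,\Pi_{(\mathfrak{N}-\nu_*)^{\perp}}(y - \nu) + \Pi_{(\mathfrak{N}-\nu_*)^{\perp}}(\nu' - \nu_*) = \alpha\,\Pi_{(\mathfrak{N}-\nu_*)^{\perp}}(y - \nu) = \alpha\,\Pi_{(\mathfrak{N}-\nu_*)^{\perp}}(y - \nu_*), \]

invariance of $h$ follows, and hence $h$ is constant on the orbits of $G(\mathfrak{N})$. Now suppose that $h(y) = h(y')$. If $h(y) = h(y') = 0$ holds, it follows that

\[ \Pi_{(\mathfrak{N}-\nu_*)^{\perp}}(y - \nu_*) = \Pi_{(\mathfrak{N}-\nu_*)^{\perp}}(y' - \nu_*) = 0. \]

Consequently, $y' - y$ is of the form $\nu^* - \nu_*$ for some $\nu^* \in \mathfrak{N}$. But this gives $y' = (y - \nu_*) + \nu^* = g_{1,\nu_*,\nu^*}(y)$, showing that $y'$ is in the same orbit as $y$. Next consider the case where $h(y) = h(y') \neq 0$. Then

\[ \Pi_{(\mathfrak{N}-\nu_*)^{\perp}}\left(\left\|\Pi_{(\mathfrak{N}-\nu_*)^{\perp}}(y' - \nu_*)\right\|(y - \nu_*) - c\left\|\Pi_{(\mathfrak{N}-\nu_*)^{\perp}}(y - \nu_*)\right\|(y' - \nu_*)\right) = 0 \]

where $c = \pm 1$. It follows that the argument inside the projection operator is of the form $\nu^* - \nu_*$ for some $\nu^* \in \mathfrak{N}$. Elementary calculations give

\[ y' = c\,\frac{\left\|\Pi_{(\mathfrak{N}-\nu_*)^{\perp}}(y' - \nu_*)\right\|}{\left\|\Pi_{(\mathfrak{N}-\nu_*)^{\perp}}(y - \nu_*)\right\|}\,(y - \nu_*) + \left(\nu_* - \frac{c}{\left\|\Pi_{(\mathfrak{N}-\nu_*)^{\perp}}(y - \nu_*)\right\|}\,(\nu^* - \nu_*)\right). \]

Since the last term in parenthesis on the right-hand side above is obviously an element of $\mathfrak{N}$, we have obtained $y' = g(y)$ for some $g \in G(\mathfrak{N})$, i.e., $y'$ is in the same orbit as $y$. This shows that $h$ is a maximal invariant. ■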

Proof of Proposition 5.4: (1) From (11) and its extension discussed subsequently to (11), as well as from the transformation theorem for integrals, we obtain

\[ E_{\mu,\sigma^2\Phi}(\varphi(y)) = E_{\alpha(\mu-\mu_0)+\mu_0',\,\alpha^2\sigma^2\Phi}\left(\varphi\left(g^{-1}_{\alpha,\mu_0,\mu_0'}(y)\right)\right). \]

By almost invariance of $\varphi$ we have that $\varphi(y) = \varphi(g^{-1}_{\alpha,\mu_0,\mu_0'}(y))$ for all $y \in \mathbb{R}^n \setminus N$ with $\lambda_{\mathbb{R}^n}(N) = 0$ (where $N$ may depend on $g^{-1}_{\alpha,\mu_0,\mu_0'}$). Since $\Phi$ is positive definite, also $P_{\alpha(\mu-\mu_0)+\mu_0',\,\alpha^2\sigma^2\Phi}(N) = 0$ holds, and thus the right-hand side of the above display equals $E_{\alpha(\mu-\mu_0)+\mu_0',\,\alpha^2\sigma^2\Phi}(\varphi(y))$, which proves the first claim. The claim in parenthesis follows similarly observing that for invariant $\varphi$ the exceptional set is empty.

(2) Setting $\alpha = 1$ in (12) shows that the rejection probability is invariant under addition of elements that belong to $\mathfrak{M}_0 - \mu_0$. Since $\mu = \Pi_{(\mathfrak{M}_0-\mu_0)}(\mu - \mu_0) + \Pi_{(\mathfrak{M}_0-\mu_0)^{\perp}}(\mu - \mu_0) + \mu_0$ we thus conclude that $E_{\mu,\sigma^2\Phi}(\varphi) = E_{\nu+\mu_0,\sigma^2\Phi}(\varphi)$ where $\nu = \Pi_{(\mathfrak{M}_0-\mu_0)^{\perp}}(\mu - \mu_0) \in \mathfrak{M}$. Now applying (12) with $\alpha = \sigma^{-1}$ and $\mu_0' = \mu_0$ to $E_{\nu+\mu_0,\sigma^2\Phi}(\varphi)$ establishes the first equality in (13). The second equality follows by the same argument by setting $\alpha = \pm\sigma^{-1}$, the sign equaling the sign of the first non-zero component of $\nu$ if $\nu \neq 0$, and the choice of sign being irrelevant if $\nu = 0$.

(3) The first claim is an immediate consequence of (13). For the second claim it suffices to show that $\mu - X\hat\beta_{rest}(\mu)$ (for $\mu \in \mathfrak{M}$) is an injective linear function of $R\beta - r$, bijectivity of this mapping following from dimension considerations. To this end note that

\[ \begin{aligned} \mu - X\hat\beta_{rest}(\mu) &= X\beta - X\left[\hat\beta(\mu) - (X'X)^{-1}R'\left(R(X'X)^{-1}R'\right)^{-1}(R\hat\beta(\mu) - r)\right] \\ &= X\beta - X\left[\beta - (X'X)^{-1}R'\left(R(X'X)^{-1}R'\right)^{-1}(R\beta - r)\right] \\ &= X(X'X)^{-1}R'\left(R(X'X)^{-1}R'\right)^{-1}(R\beta - r) \end{aligned} \]

and that the matrix premultiplying $R\beta - r$ is of full column rank $q$. ■

Proof of Proposition 5.6: Set $h(\mu, \sigma^2) = \left\langle \Pi_{(\mathfrak{M}_0-\mu_0)^{\perp}}(\mu - \mu_0)/\sigma \right\rangle$. The invariance of $(h(\mu, \sigma^2), \Sigma)$ follows from a simple computation similar to the one in the proof of Proposition 5.2. Now assume that $(h(\mu, \sigma^2), \Sigma) = (h(\mu', \sigma'^2), \Sigma')$. We immediately get $h(\mu, \sigma^2) = h(\mu', \sigma'^2)$ and $\Sigma = \Sigma'$. The former implies

\[ \Pi_{(\mathfrak{M}_0-\mu_0)^{\perp}}\left((\mu - \mu_0) - c\,(\sigma/\sigma')(\mu' - \mu_0)\right) = 0 \]

where $c = \pm 1$. Similar calculations as in the proof of Proposition 5.2 give

\[ \mu' = c\,(\sigma'/\sigma)(\mu - \mu^*) + \mu_0 \]

for some $\mu^* \in \mathfrak{M}_0$. Together with $\Sigma = \Sigma'$ this shows that $(\mu', \sigma'^2, \Sigma')$ is in the same orbit under the associated group as is $(\mu, \sigma^2, \Sigma)$. ■

E Appendix: Proofs and Auxiliary Results for Subsections 5.2 and 5.3

The next lemma is a simple consequence of a continuity property of the characteristic function of 
a multivariate Gaussian probability measure and of the Portmanteau theorem. 

Lemma E.1. Let $\Phi_m$ be a sequence of nonnegative definite symmetric $n \times n$ matrices converging to an $n \times n$ matrix $\Phi$ as $m \to \infty$, where $\Phi$ may be singular, and let $\mu_m \in \mathbb{R}^n$ be a sequence converging to $\mu \in \mathbb{R}^n$ as $m \to \infty$. Then $P_{\mu_m,\Phi_m}$ converges weakly to $P_{\mu,\Phi}$. If, in addition, $A \in \mathcal{B}(\mathbb{R}^n)$ satisfies $\lambda_{\mu+\mathrm{span}(\Phi)}(\mathrm{bd}(A)) = 0$, then $P_{\mu_m,\Phi_m}(A) \to P_{\mu,\Phi}(A)$.

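Proof of Theorem 5.7: (1) Since $Z$ is a concentration space of $\mathfrak{C}$, there exists a sequence $(\Sigma_m)_{m \in \mathbb{N}}$ in $\mathfrak{C}$ converging to $\bar\Sigma$ such that $\mathrm{span}(\bar\Sigma) = Z$. Note that $\mu_0 + Z$ is a $\lambda_{\mathbb{R}^n}$-null set because $\dim(Z) < n$ in view of Definition 2.1. Because $\Sigma_m$ is positive definite, we thus have

\[ P_{\mu_0,\sigma^2\Sigma_m}(W) = P_{\mu_0,\sigma^2\Sigma_m}(W \cup (\mu_0 + Z)). \]

By Lemma E.1 we then have that $P_{\mu_0,\sigma^2\Sigma_m}(W \cup (\mu_0 + Z))$ converges to $P_{\mu_0,\sigma^2\bar\Sigma}(W \cup (\mu_0 + Z))$. But the latter probability is not less than $P_{\mu_0,\sigma^2\bar\Sigma}(\mu_0 + Z)$, which equals 1 since $P_{\mu_0,\sigma^2\bar\Sigma}$ is supported by $\mu_0 + Z$. To prove the claim in parentheses observe that $T(\mu_0 + z) > C$ and lower semicontinuity of $T$ at $\mu_0 + z$ implies that $T(w) > C$ holds for all $w$ in a neighborhood of $\mu_0 + z$; hence such points $\mu_0 + z$ belong to $\mathrm{int}(W) \subseteq \mathrm{int}(W \cup (\mu_0 + Z))$, and consequently do not belong to $\mathrm{bd}(W \cup (\mu_0 + Z))$. But this establishes (15).

(2) Apply the same argument as above to $\mathbb{R}^n \setminus W$. Also note that $P_{\mu_0,\sigma^2\Sigma}(W)$ can be approximated arbitrarily closely by $P_{\mu_1,\sigma^2\Sigma}(W)$ for suitable $\mu_1 \in \mathfrak{M}_1$, since $\|P_{\mu_1,\sigma^2\Sigma} - P_{\mu_0,\sigma^2\Sigma}\|_{TV} \to 0$ for $\mu_1 \to \mu_0$ holds by Scheffé's Lemma as $\sigma^2\Sigma$ is positive definite.

(3) Choose an arbitrary $\mu_1 \in \mathfrak{M}_1$. By assumption we have $\inf_{\Sigma \in \mathfrak{C}} P_{\mu_0,\sigma^2\Sigma}(W) = 0$ for a suitable $\sigma^2 > 0$. It hence suffices to show that for every $\Sigma \in \mathfrak{C}$

\[ P_{\mu_0,\sigma^2\Sigma}(W) - P_{\mu_1,\sigma^2\tau^2\Sigma}(W) \to 0 \]

holds for $\tau \to \infty$. By almost invariance of $W$ under $G(\{\mu_0\})$ we have that $W \,\triangle\, (\tau W + (1 - \tau)\mu_0)$ is a $\lambda_{\mathbb{R}^n}$-null set. Hence, by the reproductive property of the normal distribution

\[ P_{\mu_1,\sigma^2\tau^2\Sigma}(W) = P_{\mu_1,\sigma^2\tau^2\Sigma}(\tau W + (1 - \tau)\mu_0) = P_{\mu_0+\tau^{-1}(\mu_1-\mu_0),\,\sigma^2\Sigma}(W). \]

But, since $\sigma^2\Sigma$ is positive definite, we have by an application of Scheffé's Lemma

\[ \left\|P_{\mu_0,\sigma^2\Sigma} - P_{\mu_0+\tau^{-1}(\mu_1-\mu_0),\,\sigma^2\Sigma}\right\|_{TV} \to 0 \]

as $\tau \to \infty$, and hence $P_{\mu_0,\sigma^2\Sigma}(W) - P_{\mu_0+\tau^{-1}(\mu_1-\mu_0),\,\sigma^2\Sigma}(W) \to 0$. The claim in parenthesis is obvious. ■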
Lemma E.2. Let $\varphi : \mathbb{R}^n \to [0,1]$ be a Borel-measurable function that is almost invariant under $G(\mathfrak{M}_0)$. Suppose $\Phi_m$ is a sequence of positive definite symmetric $n \times n$ matrices converging to a positive definite matrix $\Phi$, suppose $\mu_m \in \mathfrak{M}$, and suppose the sequence $\sigma_m^2$ satisfies $0 < \sigma_m^2 < \infty$. Then

$$\lim_{m\to\infty} E_{\mu_m,\sigma_m^2\Phi_m}(\varphi) = E_{\nu+\mu_0,\Phi}(\varphi)$$

provided $\nu_m^* = \Pi_{(\mathfrak{M}_0-\mu_0)^\perp}(\mu_m - \mu_0)/\sigma_m$ for some $\mu_0 \in \mathfrak{M}_0$ converges to an element $\nu \in \mathbb{R}^n$ (which then necessarily belongs to $\mathfrak{M}$). [Note that $\nu_m^*$, and thus the result, does not depend on the choice of $\mu_0 \in \mathfrak{M}_0$.]

Proof. By Proposition 5.4 we have that $E_{\mu_m,\sigma_m^2\Phi_m}(\varphi) = E_{\nu_m^*+\mu_0,\Phi_m}(\varphi)$. Since $\nu_m^* \to \nu$ and since $\Phi_m \to \Phi$, with $\Phi$ positive definite, the result follows from total variation distance convergence of $P_{\nu_m^*+\mu_0,\Phi_m}$ to $P_{\nu+\mu_0,\Phi}$. □

Remark E.3. (i) Consider the case where $\nu_m^* = \Pi_{(\mathfrak{M}_0-\mu_0)^\perp}(\mu_m - \mu_0)/\sigma_m$ does not converge. Then, as long as the sequence $\nu_m^*$ is bounded, the above lemma can be applied by passing to subsequences along which $\nu_m^*$ converges. In the case where the sequence $\nu_m^*$ is unbounded, then, along subsequences such that the norm of $\nu_m^*$ diverges, one would expect $E_{\mu_m,\sigma_m^2\Phi_m}(\varphi) = E_{\nu_m^*+\mu_0,\Phi_m}(\varphi)$ to converge to 1 for any reasonable test since $\nu_m^* + \mu_0$ moves farther and farther away from $\mathfrak{M}_0$ (and $\Phi_m$ stabilizes at a positive definite matrix). Indeed, such a result can be shown for a large class of tests, see Lemma 5.15.

(ii) In the special case where $\mu_m \equiv \mu$ it is easy to see, using Proposition 5.4, that the limit in the above lemma is $E_{\mu,\sigma^2\Phi}(\varphi)$ if $\sigma_m^2 \to \sigma^2 \in (0,\infty)$ and $\mu \in \mathfrak{M}_1$, is $E_{\mu_0,\Phi}(\varphi)$ if $\sigma_m \to \infty$ and $\mu \in \mathfrak{M}_1$, and is $E_{\mu_0,\Phi}(\varphi)$ if $\mu \in \mathfrak{M}_0$.

Lemma E.4. Let $\varphi : \mathbb{R}^n \to [0,1]$ be a Borel-measurable function that is almost invariant under $G(\mathfrak{M}_0)$. Suppose $\Phi_m$ is a sequence of positive definite symmetric $n \times n$ matrices converging to a singular matrix $\Phi$, suppose $\mu_m \in \mathfrak{M}$, and $\sigma_m^2$ is a sequence satisfying $0 < \sigma_m^2 < \infty$. Assume further that $\varphi(x + z) = \varphi(x)$ holds for every $x \in \mathbb{R}^n$ and every $z \in \mathrm{span}(\Phi)$. Suppose that for some sequence of positive real numbers $s_m$ the matrix $D_m = \Pi_{\mathrm{span}(\Phi)^\perp}\Phi_m\Pi_{\mathrm{span}(\Phi)^\perp}/s_m$ converges to a matrix $D$, which is regular on the orthogonal complement of $\mathrm{span}(\Phi)$. Then

$$\lim_{m\to\infty} E_{\mu_m,\sigma_m^2\Phi_m}(\varphi) = E_{\nu+\mu_0,D+\Phi}(\varphi) = E_{\nu+\mu_0,D}(\varphi)$$

provided $\nu_m^{**} = \Pi_{(\mathfrak{M}_0-\mu_0)^\perp}(\mu_m - \mu_0)/(\sigma_m s_m^{1/2})$ for some $\mu_0 \in \mathfrak{M}_0$ converges to an element $\nu \in \mathbb{R}^n$ (which then necessarily belongs to $\mathfrak{M}$). [Note that $\nu_m^{**}$, and thus the result, does not depend on the choice of $\mu_0 \in \mathfrak{M}_0$.] Furthermore, the matrix $D + \Phi$ is positive definite.

Proof. Because $\Pi_{\mathrm{span}(\Phi)}(x - \mu_m) \in \mathrm{span}(\Phi)$, we obtain by the assumed invariance w.r.t. addition of $z \in \mathrm{span}(\Phi)$

$$\varphi(x) = \varphi\big(\mu_m + \Pi_{\mathrm{span}(\Phi)^\perp}(x - \mu_m) + \Pi_{\mathrm{span}(\Phi)}(x - \mu_m)\big) = \varphi\big(\mu_m + \Pi_{\mathrm{span}(\Phi)^\perp}(x - \mu_m)\big)$$

for every $x$. By the transformation theorem we then have on the one hand

$$E_{\mu_m,\sigma_m^2\Phi_m}(\varphi(\cdot)) = E_{\mu_m,\sigma_m^2\Phi_m}\big(\varphi(\mu_m + \Pi_{\mathrm{span}(\Phi)^\perp}(\cdot - \mu_m))\big) = E_{\mu_m,\sigma_m^2\Pi_{\mathrm{span}(\Phi)^\perp}\Phi_m\Pi_{\mathrm{span}(\Phi)^\perp}}(\varphi(\cdot)) = E_{\mu_m,\sigma_m^2 s_m D_m}(\varphi(\cdot)). \qquad (38)$$

On the other hand, by the same invariance property of $\varphi$

$$E_{\mu_m,\sigma_m^2 s_m D_m}(\varphi(\cdot)) = E_{\mu_m,\sigma_m^2 s_m D_m}(\varphi(\cdot + z))$$

holds for every $z \in \mathrm{span}(\Phi)$. Integrating this w.r.t. a normal distribution $P_{0,\sigma_m^2 s_m\Phi}$ (in the variable $z$) and using the reproductive property of the normal distribution gives

$$E_{\mu_m,\sigma_m^2 s_m D_m}(\varphi(\cdot)) = E_{Q_m}(\varphi(x + z)) = E_{\mu_m,\sigma_m^2 s_m(D_m+\Phi)}(\varphi(x)) \qquad (39)$$

where $Q_m$ denotes the product of the normal distributions $P_{\mu_m,\sigma_m^2 s_m D_m}$ and $P_{0,\sigma_m^2 s_m\Phi}$. Observe that $D + \Phi$ as well as $D_m + \Phi$ are positive definite. An application of Lemma E.2 now gives

$$\lim_{m\to\infty} E_{\mu_m,\sigma_m^2 s_m(D_m+\Phi)}(\varphi) = E_{\nu+\mu_0,D+\Phi}(\varphi).$$

The same argument that has led to (39) now shows that $E_{\nu+\mu_0,D+\Phi}(\varphi) = E_{\nu+\mu_0,D}(\varphi)$. Combining this with (38) completes the proof of the display in the lemma. The positive definiteness of $D + \Phi$ is obvious as noted earlier in the proof. □

Remark E.5. (i) A remark similar to Remark E.3(i) also applies here. In particular, we typically can expect $E_{\mu_m,\sigma_m^2\Phi_m}(\varphi)$ to converge to 1 in case the norm of $\nu_m^{**}$ diverges.

(ii) In the special case where $\mu_m \equiv \mu$ it is easy to see, using Proposition 5.4, that the limit in the above lemma is $E_{\mu,\kappa(D+\Phi)}(\varphi) = E_{\mu,\kappa D}(\varphi)$ if $\sigma_m^2 s_m \to \kappa \in (0,\infty)$ and $\mu \in \mathfrak{M}_1$, is $E_{\mu_0,(D+\Phi)}(\varphi) = E_{\mu_0,D}(\varphi)$ if $\sigma_m^2 s_m \to \infty$ and $\mu \in \mathfrak{M}_1$, and is $E_{\mu_0,(D+\Phi)}(\varphi) = E_{\mu_0,D}(\varphi)$ if $\mu \in \mathfrak{M}_0$.

Remark E.6. (i) If $s_m$ and $s_m^*$ are two positive scaling factors such that $\Pi_{\mathrm{span}(\Phi)^\perp}\Phi_m\Pi_{\mathrm{span}(\Phi)^\perp}/s_m \to D$ and $\Pi_{\mathrm{span}(\Phi)^\perp}\Phi_m\Pi_{\mathrm{span}(\Phi)^\perp}/s_m^* \to D^*$ with both $D$ and $D^*$ being regular on the orthogonal complement of $\mathrm{span}(\Phi)$, then $s_m/s_m^*$ must converge to a positive finite number, i.e., the scaling sequence is essentially uniquely determined.

(ii) Typical choices for $s_m$ are $s_m = \|\Pi_{\mathrm{span}(\Phi)^\perp}\Phi_m\Pi_{\mathrm{span}(\Phi)^\perp}\|$ (for some choice of norm) or $s_m^* = \mathrm{tr}(\Pi_{\mathrm{span}(\Phi)^\perp}\Phi_m\Pi_{\mathrm{span}(\Phi)^\perp})$; note that $s_m$ as well as $s_m^*$ are positive, since $\Phi_m$ is positive definite and $\Phi$ is singular. With both choices convergence of the normalized matrix $\Pi_{\mathrm{span}(\Phi)^\perp}\Phi_m\Pi_{\mathrm{span}(\Phi)^\perp}/s_m$ (at least along suitable subsequences) is automatic. Furthermore, since for any choice of norm we have $c_1\|\Pi_{\mathrm{span}(\Phi)^\perp}\Phi_m\Pi_{\mathrm{span}(\Phi)^\perp}\| \le \mathrm{tr}(\Pi_{\mathrm{span}(\Phi)^\perp}\Phi_m\Pi_{\mathrm{span}(\Phi)^\perp}) \le c_2\|\Pi_{\mathrm{span}(\Phi)^\perp}\Phi_m\Pi_{\mathrm{span}(\Phi)^\perp}\|$ for suitable $0 < c_1 \le c_2 < \infty$, we have convergence of $s_m/s_m^*$ to a positive finite number (at least along suitable subsequences). Hence, which of the two normalization factors is used in an application of the above lemma typically does not make a difference.
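To make the comparison of the two normalizations in Remark E.6(ii) concrete, here is a minimal sketch under assumptions of our own: $n = 3$, $\Phi = \mathrm{diag}(1,0,0)$, and the lower-right $2 \times 2$ block of $\Pi_{\mathrm{span}(\Phi)^\perp}\Phi_m\Pi_{\mathrm{span}(\Phi)^\perp}$ equal to $m^{-1}\begin{pmatrix}2&1\\1&2\end{pmatrix} + m^{-2}\begin{pmatrix}1&0\\0&0\end{pmatrix}$. The norm is taken to be the spectral norm, for which the sandwich inequality holds with $c_1 = 1$ and $c_2 = 2$ on this block. All function names are hypothetical.

```python
import math

def lambda_max_sym2(a, b, d):
    """Largest eigenvalue of the symmetric 2x2 matrix [[a, b], [b, d]]."""
    return 0.5 * (a + d) + math.sqrt((0.5 * (a - d)) ** 2 + b ** 2)

def block(m):
    """Entries (a, b, d) of the assumed nonzero block of Pi Phi_m Pi."""
    return (2.0 / m + 1.0 / m ** 2, 1.0 / m, 2.0 / m)

def scaling_factors(m):
    """Return (spectral-norm normalization, trace normalization) for index m."""
    a, b, d = block(m)
    s_norm = lambda_max_sym2(a, b, d)  # spectral norm of a nonnegative definite matrix
    s_trace = a + d                    # trace
    return s_norm, s_trace

for m in (1, 10, 1000, 10 ** 6):
    s_norm, s_trace = scaling_factors(m)
    print(m, s_norm / s_trace)
```

Both factors tend to zero, while their ratio settles at a positive finite number (here the ratio of the largest eigenvalue to the trace of the leading block), in line with Remark E.6.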

Proof of Theorem 5.10: (1) By the invariance properties of the rejection probability expressed in Proposition 5.4 it suffices to show for an arbitrary fixed $\mu_0 \in \mathfrak{M}_0$ that

$$\sup_{\Sigma\in\mathfrak{C}} E_{\mu_0,\Sigma}(\varphi) < 1$$

in order to establish the first claim in Part 1. To this end let $\Sigma_m \in \mathfrak{C}$ be a sequence such that $E_{\mu_0,\Sigma_m}(\varphi)$ converges to $\sup_{\Sigma\in\mathfrak{C}} E_{\mu_0,\Sigma}(\varphi)$. Since $\mathfrak{C}$ is assumed to be bounded, we may assume without loss of generality that $\Sigma_m$ converges to a matrix $\bar{\Sigma}$ (not necessarily in $\mathfrak{C}$). If $\bar{\Sigma}$ is positive definite, it follows from Lemma E.2 applied to $E_{\mu_0,\Sigma_m}(\varphi)$ that $\sup_{\Sigma\in\mathfrak{C}} E_{\mu_0,\Sigma}(\varphi) = E_{\mu_0,\bar{\Sigma}}(\varphi)$ (since $\nu = 0$). But $E_{\mu_0,\bar{\Sigma}}(\varphi)$ is less than 1 since $\varphi \le 1$ is not $\lambda_{\mathbb{R}^n}$-almost everywhere equal to 1. If $\bar{\Sigma}$ is singular, then in view of the assumptions of the theorem we can pass to the subsequence $\Sigma_{m_i}$ and then apply Lemma E.4 to $E_{\mu_0,\Sigma_{m_i}}(\varphi)$ to obtain that $\sup_{\Sigma\in\mathfrak{C}} E_{\mu_0,\Sigma}(\varphi) = E_{\mu_0,D+\bar{\Sigma}}(\varphi)$ (since again $\nu = 0$) for a matrix $D$ with the properties as given in the theorem. But $E_{\mu_0,D+\bar{\Sigma}}(\varphi)$ is less than 1, since $D + \bar{\Sigma}$ is positive definite (as noted in Lemma E.4) and since $\varphi \le 1$ is not $\lambda_{\mathbb{R}^n}$-almost everywhere equal to 1. This proves the first claim of Part 1 of the theorem. To prove the second claim in Part 1, observe that for the same invariance reasons it suffices to show that for an arbitrary fixed $\mu_0 \in \mathfrak{M}_0$

$$\inf_{\Sigma\in\mathfrak{C}} E_{\mu_0,\Sigma}(\varphi) > 0$$

holds. Now the same argument as before shows that this infimum either equals $E_{\mu_0,\bar{\Sigma}}(\varphi)$ for some positive definite $\bar{\Sigma}$, or equals $E_{\mu_0,D+\bar{\Sigma}}(\varphi)$ for some positive definite $D + \bar{\Sigma}$. Since $\varphi \ge 0$, but $\varphi$ is not $\lambda_{\mathbb{R}^n}$-almost everywhere equal to 0 by assumption, the result follows.

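(2) Let $\mu_m \in \mathfrak{M}_1$, $0 < \sigma_m^2 < \infty$, and $\Sigma_m \in \mathfrak{C}$ be sequences such that $E_{\mu_m,\sigma_m^2\Sigma_m}(\varphi)$ converges to $\inf_{\mu_1\in\mathfrak{M}_1}\inf_{\sigma^2>0}\inf_{\Sigma\in\mathfrak{C}} E_{\mu_1,\sigma^2\Sigma}(\varphi)$. Since $\mathfrak{C}$ is assumed to be bounded, we may assume without loss of generality that $\Sigma_m$ converges to a matrix $\bar{\Sigma}$.

Consider first the case where $\bar{\Sigma}$ is positive definite: Set $\nu_m^* = \Pi_{(\mathfrak{M}_0-\mu_0)^\perp}(\mu_m - \mu_0)/\sigma_m$. If this sequence is bounded, we may pass to a subsequence $m'$ such that $\nu_{m'}^*$ converges to some $\nu$. Applying Lemma E.2 then shows that $E_{\mu_{m'},\sigma_{m'}^2\Sigma_{m'}}(\varphi)$ converges to $E_{\nu+\mu_0,\bar{\Sigma}}(\varphi)$, which is positive since $\varphi \ge 0$ is not $\lambda_{\mathbb{R}^n}$-almost everywhere equal to 0 and since $\bar{\Sigma}$ is positive definite. If the sequence $\nu_m^*$ is unbounded, we may pass to a subsequence $m'$ such that $\|\nu_{m'}^*\| \to \infty$. Since $E_{\mu_{m'},\sigma_{m'}^2\Sigma_{m'}}(\varphi) = E_{\nu_{m'}^*+\mu_0,\Sigma_{m'}}(\varphi)$ by Proposition 5.4, it follows from assumption (19) that $\lim_{m'\to\infty} E_{\mu_{m'},\sigma_{m'}^2\Sigma_{m'}}(\varphi)$ is positive.

Next consider the case where $\bar{\Sigma}$ is singular: Pass to the subsequence $m_i$ mentioned in the theorem and set now $\nu_{m_i}^{**} = \Pi_{(\mathfrak{M}_0-\mu_0)^\perp}(\mu_{m_i} - \mu_0)/(\sigma_{m_i}s_{m_i}^{1/2})$. If this sequence is bounded, we may pass to a subsequence $m_i'$ of $m_i$ such that $\nu_{m_i'}^{**}$ converges to some $\nu$. Applying Lemma E.4 then shows that $E_{\mu_{m_i'},\sigma_{m_i'}^2\Sigma_{m_i'}}(\varphi)$ converges to $E_{\nu+\mu_0,D+\bar{\Sigma}}(\varphi)$, which is positive since $\varphi \ge 0$ is not $\lambda_{\mathbb{R}^n}$-almost everywhere equal to 0 and since $D + \bar{\Sigma}$ is positive definite. If the sequence $\nu_{m_i}^{**}$ is unbounded, we may pass to a subsequence $m_i'$ of $m_i$ such that $\|\nu_{m_i'}^{**}\| \to \infty$. Since

$$E_{\mu_{m_i'},\sigma_{m_i'}^2\Sigma_{m_i'}}(\varphi) = E_{\nu_{m_i'}^{**}+\mu_0,\,D_{m_i'}+\bar{\Sigma}}(\varphi)$$

by (38), (39), and Proposition 5.4, it follows from assumption (19) and positive definiteness of $D + \bar{\Sigma}$ that $\lim_{i\to\infty} E_{\mu_{m_i'},\sigma_{m_i'}^2\Sigma_{m_i'}}(\varphi)$ is positive. Taken together the preceding arguments establish Part 2 of the theorem.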

(3) To prove the first claim of Part 3 of the theorem observe that we can find $\mu_m \in \mathfrak{M}_1$ and $\sigma_m^2$ with $0 < \sigma_m^2 < \infty$ with $d(\mu_m, \mathfrak{M}_0)/\sigma_m \ge c$ such that the expression left of the arrow in (20) differs from $E_{\mu_m,\sigma_m^2\Sigma_m}(\varphi)$ only by a sequence converging to zero. Let $m'$ denote an arbitrary subsequence. We can then find a further subsequence $m_i'$ such that the corresponding matrix $D_{m_i'}$ satisfies the assumptions of the theorem. Note that the sequence $s_{m_i'}$ corresponding to $D_{m_i'}$ necessarily converges to zero. But then the norm of $\nu_{m_i'}^{**}$ defined above must diverge since $d(\mu_{m_i'}, \mathfrak{M}_0)/\sigma_{m_i'} \ge c$ and since $\Pi_{(\mathfrak{M}_0-\mu_0)^\perp}$ is the projection onto the orthogonal complement of $\mathfrak{M}_0 - \mu_0$. Because

$$E_{\mu_{m_i'},\sigma_{m_i'}^2\Sigma_{m_i'}}(\varphi) = E_{\nu_{m_i'}^{**}+\mu_0,\,D_{m_i'}+\bar{\Sigma}}(\varphi)$$

in view of (38), (39), and Proposition 5.4, the result then follows from the assumption that the limit inferior in (19) is equal to 1, noting that $D_{m_i'} + \bar{\Sigma}$ is positive definite and converges to the positive definite matrix $D + \bar{\Sigma}$.

We next prove the second claim in Part 3. Choose $\mu_m \in \mathfrak{M}_1$ with $d(\mu_m, \mathfrak{M}_0) \ge c_m$ such that the expression to the left of the arrow in (21) differs from $E_{\mu_m,\sigma_m^2\Sigma_m}(\varphi)$ only by a sequence converging to zero. Since

$$E_{\mu_m,\sigma_m^2\Sigma_m}(\varphi) = E_{\nu_m^*+\mu_0,\Sigma_m}(\varphi)$$

by Proposition 5.4, where $\nu_m^*$ was defined above, and since $\|\nu_m^*\| \ge c_m/\sigma_m \to \infty$ clearly holds, the result follows from the assumption that the limit inferior in (19) is equal to 1. [Note that we have

not made use of condition (fT5|) and the condition on £ following (TT5)) .] ■ 

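Proof of Theorem 5.12: By invariance properties of the rejection probability (cf. Proposition 5.4) it suffices to show for the particular $\mu_0^* \in \mathfrak{M}_0$ appearing in (23) that for every $\delta$, $0 < \delta < 1$, there exists $k_0 = k_0(\delta)$ such that

$$\sup_{\Sigma\in\mathfrak{C}} E_{\mu_0^*,\Sigma}(\varphi_{k_0}) \le \delta. \qquad (40)$$

For this it suffices to show that $\sup_{\Sigma\in\mathfrak{C}} E_{\mu_0^*,\Sigma}(\varphi_k)$ converges to zero for $k \to \infty$. Let $\Sigma_k \in \mathfrak{C}$ be a sequence such that for all $k \ge 1$

$$\sup_{\Sigma\in\mathfrak{C}} E_{\mu_0^*,\Sigma}(\varphi_k) \le E_{\mu_0^*,\Sigma_k}(\varphi_k) + k^{-1}. \qquad (41)$$

Since $\mathfrak{C}$ is assumed to be bounded, we can find for every subsequence $k^*$ a further subsubsequence $k'$ such that $\Sigma_{k'}$ converges to a matrix $\bar{\Sigma}$ (not necessarily in $\mathfrak{C}$). Let $\varepsilon > 0$ be given. We then distinguish two cases:

Case 1: $\bar{\Sigma}$ is positive definite. By (23) we can then find a $k_0'$ in the subsequence such that

$$E_{\mu_0^*,\bar{\Sigma}}(\varphi_{k_0'}) < \varepsilon/2$$

holds. But then by (41) and by the monotonicity expressed in (23)

$$\sup_{\Sigma\in\mathfrak{C}} E_{\mu_0^*,\Sigma}(\varphi_{k'}) \le E_{\mu_0^*,\Sigma_{k'}}(\varphi_{k'}) + k'^{-1} \le E_{\mu_0^*,\Sigma_{k'}}(\varphi_{k_0'}) + k'^{-1} \qquad (42)$$

holds for all $k' \ge k_0'$. Now Lemma E.2 together with Remark E.3(ii) may clearly be applied to the subsequence $k'$, showing that $E_{\mu_0^*,\Sigma_{k'}}(\varphi_{k_0'})$ converges to $E_{\mu_0^*,\bar{\Sigma}}(\varphi_{k_0'}) < \varepsilon/2$. But this shows that

$$\limsup_{k'\to\infty}\sup_{\Sigma\in\mathfrak{C}} E_{\mu_0^*,\Sigma}(\varphi_{k'}) < \varepsilon. \qquad (43)$$

Case 2: $\bar{\Sigma}$ is singular. Then we can find a subsequence $k_i'$ of $k'$ and normalization constants $s_{k_i'}$ such that the resulting matrices $D_{k_i'}$ converge to a matrix $D$ with the properties specified in Theorem 5.10. Because $D + \bar{\Sigma}$ is positive definite, we can in view of (23) find a $k_{i(0)}'$ in the subsequence $k_i'$ such that

$$E_{\mu_0^*,D+\bar{\Sigma}}(\varphi_{k_{i(0)}'}) < \varepsilon/2.$$

Now applying Lemma E.4 together with Remark E.5(ii) to the subsequence $k_i'$ shows that $E_{\mu_0^*,\Sigma_{k_i'}}(\varphi_{k_{i(0)}'})$ converges to $E_{\mu_0^*,D+\bar{\Sigma}}(\varphi_{k_{i(0)}'}) < \varepsilon/2$. But by (41) and (23)

$$\sup_{\Sigma\in\mathfrak{C}} E_{\mu_0^*,\Sigma}(\varphi_{k_i'}) \le E_{\mu_0^*,\Sigma_{k_i'}}(\varphi_{k_i'}) + k_i'^{-1} \le E_{\mu_0^*,\Sigma_{k_i'}}(\varphi_{k_{i(0)}'}) + k_i'^{-1}$$

holds for all $i \ge i(0)$. This shows that

$$\limsup_{i\to\infty}\sup_{\Sigma\in\mathfrak{C}} E_{\mu_0^*,\Sigma}(\varphi_{k_i'}) < \varepsilon.$$

Taken together we have shown that $\sup_{\Sigma\in\mathfrak{C}} E_{\mu_0^*,\Sigma}(\varphi_k)$ must converge to zero along the original sequence $k$, which proves (40). ■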

F Appendix: Proofs and Auxiliary Results for Subsection 

EI 

Lemma F.1. Suppose Assumption 5 holds. Then the sets

$$A_1 = \{y \in \mathbb{R}^n\setminus N : \det\check{\Omega}(y) = 0\} \quad\text{and}\quad A_2 = \{y \in \mathbb{R}^n\setminus N : \det\check{\Omega}(y) \neq 0\}$$

are invariant under $G(\mathfrak{M})$, the former set being closed in the relative topology on $\mathbb{R}^n\setminus N$. The set

$$N^* = N \cup \{y \in \mathbb{R}^n\setminus N : \det\check{\Omega}(y) = 0\}$$

is a closed $\lambda_{\mathbb{R}^n}$-null set in $\mathbb{R}^n$ that is invariant under $G(\mathfrak{M})$.

Proof. The invariance of $A_1$ and $A_2$ follows immediately from the invariance of $\mathbb{R}^n\setminus N$ and the equivariance of $\check{\Omega}(y)$. The relative closedness of $A_1$ is an immediate consequence of the continuity of $\check{\Omega}(y)$ on $\mathbb{R}^n\setminus N$. The invariance of $N^*$ follows from invariance of $N$ discussed after Assumption 5 and the just established invariance of $A_1$. Because $N$ is a $\lambda_{\mathbb{R}^n}$-null set and because $\check{\Omega}(y)$ is $\lambda_{\mathbb{R}^n}$-almost surely nonsingular on $\mathbb{R}^n\setminus N$, it follows that $N^*$ is a $\lambda_{\mathbb{R}^n}$-null set. Finally, we establish closedness of $N^*$: let $y_i \in N^*$ be a sequence with limit $y_0$. If $y_0 \in N$, we are done. If $y_0 \in \mathbb{R}^n\setminus N$, by openness of this set also $y_i \in \mathbb{R}^n\setminus N$ for all but finitely many $i$ must hold and thus $\det\check{\Omega}(y_i) = 0$. But then continuity of $\check{\Omega}$ on $\mathbb{R}^n\setminus N$ implies $\det\check{\Omega}(y_0) = 0$, and hence $y_0 \in N^*$. □



Proof of Lemma 5.15: (1) Follows from the discussion preceding the lemma and Lemma F.1.

(2) Follows immediately from the observation that $T$ coincides on the open set $\mathbb{R}^n\setminus N^*$ with $(R\check{\beta}(y) - r)'\check{\Omega}^{-1}(y)(R\check{\beta}(y) - r)$, which is continuous on this set by Assumption 5.

(3) Since $N^*$ is invariant under the elements of $G(\mathfrak{M})$, it is in particular invariant under $G(\mathfrak{M}_0)$. The result $T(g(y)) = T(y) = 0$ for $g \in G(\mathfrak{M}_0)$ then follows trivially for $y \in N^*$. Now suppose $y \in \mathbb{R}^n\setminus N^*$. Then also $g_{\alpha,\mu_0^{(1)},\mu_0^{(2)}}(y) \in \mathbb{R}^n\setminus N^*$ for $\alpha \neq 0$, $\mu_0^{(i)} \in \mathfrak{M}_0$ ($i = 1,2$) by invariance of $\mathbb{R}^n\setminus N^*$. The invariance of $T$ then follows immediately from the equivariance properties of $\check{\beta}$ and $\check{\Omega}$ expressed in Assumption 5, using that $\mu_0^{(i)} \in \mathfrak{M}_0$ implies $R\gamma^{(i)} = r$ for uniquely defined vectors $\gamma^{(i)}$ satisfying $\mu_0^{(i)} = X\gamma^{(i)}$.
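The invariance argument in Part 3 can be checked numerically in a small case. The sketch below is our own illustration and makes several assumptions that are not taken from this section: a single regressor ($k = q = 1$, $R = 1$), the OLS estimator in the role of $\check{\beta}$, the usual residual-based variance estimator in the role of $\check{\Omega}$, and equivariance of the form $\check{\beta}(\alpha y + X\gamma) = \alpha\check{\beta}(y) + \gamma$, $\check{\Omega}(\alpha y + X\gamma) = \alpha^2\check{\Omega}(y)$. With these choices $\mathfrak{M}_0$ is the single point $Xr$, and the maps in $G(\mathfrak{M}_0)$ are $y \mapsto \alpha(y - Xr) + Xr$.

```python
X = [1.0, 2.0, 3.0, 4.0]   # single regressor, so k = q = 1 and R = 1 (assumed design)
R_VALUE = 0.5              # hypothesized value r; the null mean is the point X * r

def beta_hat(y):
    """OLS coefficient for the one-regressor model."""
    return sum(x * v for x, v in zip(X, y)) / sum(x * x for x in X)

def omega_hat(y):
    """Residual-based variance estimate of the OLS coefficient."""
    b = beta_hat(y)
    rss = sum((v - x * b) ** 2 for x, v in zip(X, y))
    return rss / (len(X) - 1) / sum(x * x for x in X)

def T(y):
    """Wald-type statistic (beta - r)' Omega^{-1} (beta - r) in the scalar case."""
    return (beta_hat(y) - R_VALUE) ** 2 / omega_hat(y)

def g(y, alpha):
    """Element of G(M_0): rescale y around the null mean mu_0 = X * r."""
    mu0 = [x * R_VALUE for x in X]
    return [alpha * (v - m) + m for v, m in zip(y, mu0)]

y = [1.0, 0.5, 2.0, 3.5]
for alpha in (-2.0, 0.3, 5.0):
    print(alpha, T(g(y, alpha)), T(y))
```

The numerator of $T$ scales by $\alpha^2$ because $\check{\beta}(g(y)) - r = \alpha(\check{\beta}(y) - r)$, and the denominator scales by $\alpha^2$ as well, so the two printed values agree up to rounding for every $\alpha \neq 0$.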

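(4) Set $O = \{y \in \mathbb{R}^n : T(y) = C\}$ and note that $O \subseteq \mathbb{R}^n\setminus N^*$ since $C > 0$ by assumption. We can then write

$$O = \bigcup_{y_2\in\mathfrak{M}^\perp}\big(\{y_1 \in \mathfrak{M} : y_1 + y_2 \in \mathbb{R}^n\setminus N^*,\ T(y_1 + y_2) = C\} + y_2\big) = \bigcup_{y_2\in\mathfrak{M}^\perp}\big(O(y_2) + y_2\big).$$

Note that $O$ as well as $O(y_2)$ are clearly measurable sets. By the already established invariance of $\mathbb{R}^n\setminus N^*$, the fact that $\mathbb{R}^n\setminus N^* \subseteq \mathbb{R}^n\setminus N$, and by the equivariance properties of $\check{\beta}$ and $\check{\Omega}$ maintained in Assumption 5, the set $O(y_2)$ equals

$$\big\{y_1 \in \mathfrak{M} : \big(R(\check{\beta}(y_2) + (X'X)^{-1}X'y_1) - r\big)'\check{\Omega}^{-1}(y_2)\big(R(\check{\beta}(y_2) + (X'X)^{-1}X'y_1) - r\big) = C\big\}$$

if $y_2 \in (\mathbb{R}^n\setminus N^*) \cap \mathfrak{M}^\perp$, and it is empty if $y_2 \in N^* \cap \mathfrak{M}^\perp$ (since $C > 0$). If $y_2 \in (\mathbb{R}^n\setminus N^*) \cap \mathfrak{M}^\perp$, the set $O(y_2) \subseteq \mathfrak{M}$ is the image of

$$\tilde{O}(y_2) = \big\{\gamma \in \mathbb{R}^k : \big(R(\check{\beta}(y_2) + \gamma) - r\big)'\check{\Omega}^{-1}(y_2)\big(R(\check{\beta}(y_2) + \gamma) - r\big) = C\big\}$$

under the invertible linear map $\gamma \mapsto X\gamma$ from $\mathbb{R}^k$ onto $\mathfrak{M}$. Now $\tilde{O}(y_2)$ is the zero-set of a multivariate real polynomial (in the components of $\gamma$). The polynomial does not vanish everywhere on $\mathbb{R}^k$ because the quadratic form making up the polynomial is unbounded on $\mathbb{R}^k$ (because $\check{\Omega}^{-1}(y_2)$ is symmetric and well-defined if $y_2 \in \mathbb{R}^n\setminus N^*$ and because $\mathrm{rank}(R) = q$ holds). Consequently, $\tilde{O}(y_2)$ has $k$-dimensional Lebesgue measure zero and hence $\lambda_{\mathfrak{M}}(O(y_2)) = 0$ for every $y_2 \in (\mathbb{R}^n\setminus N^*) \cap \mathfrak{M}^\perp$. We conclude that $\lambda_{\mathfrak{M}}(O(y_2)) = 0$ for every $y_2 \in \mathfrak{M}^\perp$.

We now identify $\mathbb{R}^n$ with $\mathfrak{M} \times \mathfrak{M}^\perp$ and view Lebesgue measure $\lambda_{\mathbb{R}^n}$ on $\mathbb{R}^n$ as $\lambda_{\mathfrak{M}} \otimes \lambda_{\mathfrak{M}^\perp}$. Hence, $y$ is identified with $(y_1, y_2) \in \mathfrak{M} \times \mathfrak{M}^\perp$ satisfying $y = y_1 + y_2$. Fubini's Theorem then shows

$$\lambda_{\mathbb{R}^n}(O) = \lambda_{\mathfrak{M}}\otimes\lambda_{\mathfrak{M}^\perp}(O) = \int_{\mathfrak{M}\times\mathfrak{M}^\perp} 1_O((y_1,y_2))\,d\lambda_{\mathfrak{M}}\otimes\lambda_{\mathfrak{M}^\perp}(y_1,y_2)$$

$$= \int_{\mathfrak{M}^\perp}\int_{\mathfrak{M}} 1_{O(y_2)}(y_1)\,d\lambda_{\mathfrak{M}}(y_1)\,d\lambda_{\mathfrak{M}^\perp}(y_2) = \int_{\mathfrak{M}^\perp}\lambda_{\mathfrak{M}}(O(y_2))\,d\lambda_{\mathfrak{M}^\perp}(y_2) = 0.$$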

(5&6) First observe that $\{y \in \mathbb{R}^n\setminus N^* : T(y) > C\} = \{y \in \mathbb{R}^n : T(y) > C\}$ holds in view of $C > 0$ and the definition of $T$. By continuity of $T$ on $\mathbb{R}^n\setminus N^*$ established in Part 2 and by openness of $\mathbb{R}^n\setminus N^*$, the openness of $\{y \in \mathbb{R}^n\setminus N^* : T(y) > C\}$ and $\{y \in \mathbb{R}^n\setminus N^* : T(y) < C\}$ follows. It hence suffices to show that these two sets are non-empty: Choose an arbitrary $y \in \mathbb{R}^n\setminus N^*$ and set $y(\gamma) = y + X\gamma$ for $\gamma \in \mathbb{R}^k$. Then $y(\gamma) \in \mathbb{R}^n\setminus N^*$ by invariance of $\mathbb{R}^n\setminus N^*$ under $G(\mathfrak{M})$. Now by the equivariance properties of $\check{\beta}$ and $\check{\Omega}$ expressed in Assumption 5

$$T(y(\gamma)) = \big(R\gamma + R\check{\beta}(y) - r\big)'\check{\Omega}^{-1}(y)\big(R\gamma + R\check{\beta}(y) - r\big).$$

Define $\gamma = \beta - \check{\beta}(y)$ for some $\beta$ satisfying $R\beta = r$. Then $T(y(\gamma)) = 0 < C$ holds, showing that $\{y \in \mathbb{R}^n\setminus N^* : T(y) < C\}$ is non-empty. Finally choose $y \in \mathbb{R}^n\setminus N^*$ and $v$ as in Assumption 6. Choose $\delta$ such that $v = R\delta$. Then set $\gamma = c\delta + \beta - \check{\beta}(y)$ where $\beta$ is as before and $c$ is a real number. Observe that then $T(y(\gamma)) = c^2 v'\check{\Omega}^{-1}(y)v$. Choosing $c$ sufficiently large shows that $T(y(\gamma)) > C$ can be achieved, establishing that $\{y \in \mathbb{R}^n\setminus N^* : T(y) > C\}$ is non-empty.
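(7) Let $G$ be a standard normal $n \times 1$ random vector. Then

$$P_{\nu_m+\mu_0,\Phi_m}(W(C)) = \Pr\big(T(\nu_m + \mu_0 + \Phi_m^{1/2}G) - C > 0\big). \qquad (44)$$

Set $\gamma_m = (X'X)^{-1}X'\nu_m$ and $\gamma_0 = (X'X)^{-1}X'\mu_0$. Observe that $R\gamma_0 = r$ while $\|R\gamma_m\| \to \infty$ as $m \to \infty$ in view of $\nu_m \in \Pi_{(\mathfrak{M}_0-\mu_0)^\perp}(\mathfrak{M}_1 - \mu_0)$ and $\|\nu_m\| \to \infty$. For $\Phi_m^{1/2}G \in \mathbb{R}^n\setminus N^*$ (an event which has probability 1 because $N^*$ is a $\lambda_{\mathbb{R}^n}$-null set and $\Phi_m$ is positive-definite) we may use equivariance of $\check{\beta}$ and $\check{\Omega}$ and obtain that $T(\nu_m + \mu_0 + \Phi_m^{1/2}G) - C$ coincides on this event with

$$\big(R\gamma_m + R\check{\beta}(\Phi_m^{1/2}G)\big)'\check{\Omega}^{-1}(\Phi_m^{1/2}G)\big(R\gamma_m + R\check{\beta}(\Phi_m^{1/2}G)\big) - C. \qquad (45)$$

Observe that $\Phi_m^{1/2}G \to \Phi^{1/2}G$ as $m \to \infty$ with probability 1. Furthermore, $\check{\beta}$ and $\check{\Omega}^{-1}$ are continuous on $\mathbb{R}^n\setminus N^*$, a set which has probability 1 under the law of $\Phi^{1/2}G$ (since $N^*$ is a $\lambda_{\mathbb{R}^n}$-null set and $\Phi$ is positive-definite). From the continuous mapping theorem we conclude that $R\check{\beta}(\Phi_m^{1/2}G)$ and $\check{\Omega}^{-1}(\Phi_m^{1/2}G)$ converge almost surely to $R\check{\beta}(\Phi^{1/2}G)$ and $\check{\Omega}^{-1}(\Phi^{1/2}G)$, respectively. Now let $v \in A((\nu_m)_{m\ge1})$ and let $m_i$ be a subsequence such that $R\gamma_{m_i}/\|R\gamma_{m_i}\| \to v$. It follows that

$$\Big[\big(R\gamma_{m_i} + R\check{\beta}(\Phi_{m_i}^{1/2}G)\big)'\check{\Omega}^{-1}(\Phi_{m_i}^{1/2}G)\big(R\gamma_{m_i} + R\check{\beta}(\Phi_{m_i}^{1/2}G)\big) - C\Big]\Big/\|R\gamma_{m_i}\|^2$$

converges to

$$v'\check{\Omega}^{-1}(\Phi^{1/2}G)v$$

with probability 1. Since $\Pr\big(v'\check{\Omega}^{-1}(\Phi^{1/2}G)v = 0\big) = 0$ by Assumption 7 it follows that

$$\Pr\big(T(\nu_{m_i} + \mu_0 + \Phi_{m_i}^{1/2}G) - C > 0\big) \to \Pr\big(v'\check{\Omega}^{-1}(\Phi^{1/2}G)v > 0\big).$$

This shows that

$$\liminf_{m\to\infty} P_{\nu_m+\mu_0,\Phi_m}(W(C)) \le \liminf_{i\to\infty} P_{\nu_{m_i}+\mu_0,\Phi_{m_i}}(W(C)) = \Pr\big(v'\check{\Omega}^{-1}(\Phi^{1/2}G)v > 0\big),$$

implying that

$$\liminf_{m\to\infty} P_{\nu_m+\mu_0,\Phi_m}(W(C)) \le \inf_{v\in A((\nu_m)_{m\ge1})}\Pr\big(v'\check{\Omega}^{-1}(\Phi^{1/2}G)v > 0\big).$$

Conversely, let $m_i$ be a subsequence such that

$$P_{\nu_{m_i}+\mu_0,\Phi_{m_i}}(W(C)) \to \liminf_{m\to\infty} P_{\nu_m+\mu_0,\Phi_m}(W(C)).$$

Since the unit ball in $\mathbb{R}^q$ is compact, we may assume that $R\gamma_{m_{i(j)}}/\|R\gamma_{m_{i(j)}}\|$ converges to some $v \in A((\nu_m)_{m\ge1})$ along a suitable subsequence $m_{i(j)}$. The same arguments as above then show that

$$\liminf_{m\to\infty} P_{\nu_m+\mu_0,\Phi_m}(W(C)) = \liminf_{j\to\infty} P_{\nu_{m_{i(j)}}+\mu_0,\Phi_{m_{i(j)}}}(W(C)) = \Pr\big(v'\check{\Omega}^{-1}(\Phi^{1/2}G)v > 0\big) \ge \inf_{v\in A((\nu_m)_{m\ge1})}\Pr\big(v'\check{\Omega}^{-1}(\Phi^{1/2}G)v > 0\big).$$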

Given Assumption [3 the remaining equalities and inequalities in (|26[) and (|2T[) are now obvious. 
Proof of Corollary 5.17: (1) If $z \in \mathbb{R}^n\setminus N^*$ then $\mu_0 + z \in \mathbb{R}^n\setminus N^*$ and $T$ is continuous at $\mu_0 + z$ for every $\mu_0 \in \mathfrak{M}_0$ by Parts 1 and 2 of Lemma 5.15. If $T(\mu_0^* + z) > C$ holds, then by the invariance of $T$ established in Part 3 of Lemma 5.15 we have $T(\mu_0 + z) = T(\mu_0^* + z) > C$ for every $\mu_0 \in \mathfrak{M}_0$. Hence the sufficient conditions in Part 1 of Theorem 5.7 are satisfied and an application of this theorem delivers the result.

(2) Completely analogous to the proof of (1), noting that the invariance of $T$ required in Part 3 of Theorem 5.7 is clearly satisfied.

(3) Since $N^*$ is a $\lambda_{\mathbb{R}^n}$-null set the test statistic $T$ is $\lambda_{\mathbb{R}^n}$-almost surely equal to the test statistic

$$T^*(y) = \begin{cases} T(y) & y \in \mathbb{R}^n\setminus N^*, \\ \infty & y \in N^*. \end{cases}$$

We verify that the sufficient conditions in Part 1 of Theorem 5.7 are satisfied for $T^*$. To that end fix $\mu_0 \in \mathfrak{M}_0$ and let $Z' \subseteq Z$ denote the set of all $z$ such that $z \in \mathbb{R}^n\setminus N$, $\check{\Omega}(z) = 0$, and $R\check{\beta}(z) \neq 0$ hold. By invariance of $N$ (cf. discussion after Assumption 5) and equivariance of $\check{\Omega}$ we see that $z \in Z'$ implies $\mu_0 + z \in \mathbb{R}^n\setminus N$ and $\check{\Omega}(\mu_0 + z) = 0$, and thus $T^*(\mu_0 + z) = \infty > C$ holds for every $z \in Z'$ by definition of $T^*$. We next show that $T^*$ is lower semicontinuous at $\mu_0 + z$ for every $z \in Z'$. Let $y_m$ be a sequence converging to $\mu_0 + z$. Since $\mathbb{R}^n\setminus N$ is open, we may assume that this sequence entirely belongs to $\mathbb{R}^n\setminus N$. If $\det\check{\Omega}(y_m) = 0$ eventually holds, we are done since then $T^*(y_m) = \infty$ eventually by construction. By a standard subsequence argument we may thus assume that $\det\check{\Omega}(y_m) > 0$ eventually holds since $\check{\Omega}$ is nonnegative definite on $\mathbb{R}^n\setminus N$ by assumption. Now note that

$$T^*(y_m) = T(y_m) = (R\check{\beta}(y_m) - r)'\check{\Omega}(y_m)^{-1}(R\check{\beta}(y_m) - r) \ge \lambda_{\max}^{-1}(\check{\Omega}(y_m))\,\|R\check{\beta}(y_m) - r\|^2.$$

Since $\check{\beta}$ is continuous on $\mathbb{R}^n\setminus N$ by assumption, we have $R\check{\beta}(y_m) \to R\check{\beta}(\mu_0 + z) = R\check{\beta}(z) + r \neq r$ where we have made use of equivariance of $\check{\beta}$ and of $\mu_0 \in \mathfrak{M}_0$. Hence $\|R\check{\beta}(y_m) - r\| \to \|R\check{\beta}(z)\| > 0$. Furthermore, $\check{\Omega}$ is continuous on $\mathbb{R}^n\setminus N$ by assumption, hence $\check{\Omega}(y_m) \to \check{\Omega}(\mu_0 + z) = 0$. Consequently, $T^*(y_m) \to \infty$, establishing lower semicontinuity of $T^*$. We may now apply Part 1 of Theorem 5.7 together with Remark 5.8(i) to conclude the proof. ■

Lemma F.2. Let $\check{\beta}$ and $\check{\Omega}$ satisfy Assumption 5, let $T$ be the test statistic defined in (25), and let $W(C) = \{y \in \mathbb{R}^n : T(y) > C\}$ with $0 < C < \infty$ be the rejection region. Let $\Phi_m$ be symmetric positive definite $n \times n$ matrices such that $\Phi_m \to \Phi$ for $m \to \infty$ where $\Phi$ is singular with $l := \dim\mathrm{span}(\Phi) > 0$. Suppose that for some sequence of positive real numbers $s_m$ the matrix $D_m = \Pi_{\mathrm{span}(\Phi)^\perp}\Phi_m\Pi_{\mathrm{span}(\Phi)^\perp}/s_m$ converges to a matrix $D$, which is regular on $\mathrm{span}(\Phi)^\perp$, and that $\Pi_{\mathrm{span}(\Phi)^\perp}\Phi_m\Pi_{\mathrm{span}(\Phi)}/s_m^{1/2} \to 0$. Suppose further that $\mathrm{span}(\Phi) \subseteq \mathfrak{M}$. Let $Z$ be a matrix, the columns of which form a basis for $\mathrm{span}(\Phi)$, and let $G$ be a standard normal $n$-vector. Then:

1. For every $\mu_0 \in \mathfrak{M}_0$, $\gamma \in \mathbb{R}^l$, $0 < \sigma < \infty$ we have

$$s_m\big[T(\mu_0 + Z\gamma + \sigma\Phi_m^{1/2}G) - C\big] \xrightarrow{d} \xi(\gamma,\sigma)$$

for $m \to \infty$ where the random variable $\xi(\gamma,\sigma)$ is given by

$$\big(R\hat{\beta}(\sigma^{-1}Z\gamma + \Phi^{1/2}G)\big)'\check{\Omega}^{-1}\big((\Phi^{1/2} + D^{1/2})G\big)\big(R\hat{\beta}(\sigma^{-1}Z\gamma + \Phi^{1/2}G)\big)$$

for $(\Phi^{1/2} + D^{1/2})G \notin N^*$, which is an event that has probability 1 under the law of $G$, and where $\xi(\gamma,\sigma) = 0$ else.

2. If additionally Assumption 7 holds and

$$R\hat{\beta}(z) \neq 0 \quad \lambda_{\mathrm{span}(\Phi)}\text{-a.e.}$$

is satisfied, then

$$P_{\mu_0+Z\gamma,\sigma^2\Phi_m}(W(C)) = \Pr\big(T(\mu_0 + Z\gamma + \sigma\Phi_m^{1/2}G) > C\big) \to \Pr\big(\xi(\gamma,\sigma) > 0\big)$$

as $m \to \infty$.

Proof. (1) Observe that /i + Z7 e 971, that the columns of $ 1//2 as well of n span ($)$m belong to 
971, and that M. n \N* is invariant under the group G(97t). Hence, using the equivariance properties 

of p and Cl expressed in Assumption [5] repeatedly, we obtain that on the event < $m G S R n \JV* > 
^ (/x + Z 7 + (7^ 2 g) - r = R(3^ + Z 1 + an span($) $,i{ 2 G + <7ll spall($)i $i/ 2 g) - r 

= R (BZ 1 + (rBn span($) $i/ 2 G + asll 2 ? ( Sm 1/2 n s pan($)^ 2 G 

= oR (g- 1 BZ 1 + K m + s)l 2 p (i, 



holds, where B is shorthand for {X'X)~ X X', K m = B (n span(< j,)$^ /2 - 4( 2 $ 1/2 ) G, and L 
$1/2(5 + Sm n span ($\j_<I>„{ G). Similarly, we obtain 

A (mo + ^7 + ^ /2 g) - ^ 2 fi (*J/ 2 g) = a 2 ^ (n span($) $^ 2 G + n span($) ^/ 2 G) 

= a 2 fi(n span($)i $V2 G ) =a 2 Sm f2( J L m ) 

on the event < <&„{ G £ R™\iV* >. Hence, on this event we have 

s m [T (mo + ^7 + ^*,\{ 2 G) - CJ = (i? (cr- 1 ^^ + K m + s^fS (Z m )) ) ' ft" 1 (£ m ) 

xi? (a" 1 ^ + K m + s]l 2 "l3 (L m )) - s m C. 
Clearly, Jf m and L m are jointly normal with mean zero and second moments given by 

E (K m K' m ) = b (n spanW <i>v 2 _ s v 2 $i/2) (n spaa(#) $v 2 - s)l 2 ^/ 2 ) B', 



E (L m L' m ) = $ + D n 



and 



e (K m L' m ) = b (n span($) $v 2 - 4( 2 * 1/2 ) ($ 1/2 + C /2 n span( *)^ r 1 ,/ 2 )' . 



It is easy to see that E [K m K' m J converges to B&B' because s m — > 0, while E {L m L' n ^ converges 
to $ + D. Furthermore, E [K m L' m ^j converges to B& because 



b (n span(#) $v 2 - sll 2 ^/ 2 ) (s-wu^w^U 2 )' 

b fn span($)i a> m n span($) /4/ 2 V - b^ 2 ^ 2 il. 



span($) i ^m 11 span($)/ b m j °^ ^m il span($) i 

-B$n span($) i = -b (n span($) i$) = 
1 /2

where we have made use of the assumption n spall (,j,)i $ m n span ($) /«„ — > and of symmetry of $. 
Hence we have (cf. Lemma [E.l[) that 

" B$B' 5$ 
$B' $ + D 



^ m ) 4 jv ( o, 



Note that this limiting normal distribution is also the joint distribution of K — B^ X I 2 G and L = 
($1/2 + £)i/2) G. [Observe that $ x / 2 + D 1 / 2 = ($ + £>) 1/2 since $L> = L>$ = as D vanishes on 
span($) by construction.] Now consider the map / on W l+k given by f(x, y) = {fi(x), /a(y), faiv)) 
where fi(x) = x for x G R k , and where fi{y) — p(y), fs{y) = ^~ a (y) for y G M™\7V* and are zero 
else. Observe that the set of discontinuity points, F say, of / is contained in M. k x N* . But 

Pr ((K, L)eF)< Pr ((K, L) G K fc x TV*) = Pr (L € N*) = (46) 

because N* is a AR^-null set and the distribution of L is equivalent to Lebesgue measure on K™ as 
$ + I? is positive definite. This shows that / (K m , L m ) converges in distribution to f (K,L) as 
m — > oo. Now 



T ( Mo + Z 7 + cr$^ 2 Gj - Cj = (i? (a- 1 ^^ + h (K m ) + s^ 2 h{L m )) ) h [L m ) 

xR (a- 1 BZ 1 + h (K rn ) + s]l 2 f 2 {L m )) ~ s m C 



holds everywhere (note that L m G M. n \N* if and only if $^ 2 G G R n \N* by G(9Jt)-invariance of 

l li 
K"\iV*). Because s™ fi\Lm) converges to zero in probability and s m C ->0we immediately see 

that the random variable in the preceding display converges in distribution to 

(R (a- l BZ 1 + A (b® 1 / 2 g))Y f z ((V /2 + L> 1/2 ) G) R (a^BZ-y + h (b<5> 1/2 g\\ 

which coincides with £ (7, a). Finally, the claim that { (-J 1 / 2 + D 1 / 2 ) G G R"\7V*} is a probability 
1 event has already been established in (|4"6")l . 
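The identity $\Phi^{1/2} + D^{1/2} = (\Phi + D)^{1/2}$ noted in the proof of Part 1 rests only on $\Phi D = D\Phi = 0$. A minimal numerical sketch follows, under assumptions of our own: $n = 2$, $\Phi = a\,uu'$ and $D = b\,vv'$ with orthonormal $u, v$, so that $D$ vanishes on $\mathrm{span}(\Phi)$ and is regular on its orthogonal complement. The helper names are hypothetical.

```python
import math

def outer(u, v):
    """Outer product u v' of two 2-vectors."""
    return [[u[i] * v[j] for j in range(2)] for i in range(2)]

def scale(c, A):
    return [[c * A[i][j] for j in range(2)] for i in range(2)]

def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))

u, v = (0.6, 0.8), (-0.8, 0.6)     # orthonormal basis of R^2
a, b = 4.0, 9.0                    # nonzero eigenvalues of Phi and D

Phi, D = scale(a, outer(u, u)), scale(b, outer(v, v))
Phi_half = scale(math.sqrt(a), outer(u, u))   # symmetric square root of Phi
D_half = scale(math.sqrt(b), outer(v, v))     # symmetric square root of D
root_sum = add(Phi_half, D_half)

# (Phi^{1/2} + D^{1/2})^2 should reproduce Phi + D because the cross terms vanish.
print(max_abs_diff(matmul(root_sum, root_sum), add(Phi, D)))
print(max_abs_diff(matmul(Phi, D), [[0.0, 0.0], [0.0, 0.0]]))
```

Both printed discrepancies are at the level of rounding error, which is what the bracketed observation in the proof asserts for general $\Phi$ and $D$ with $\Phi D = D\Phi = 0$.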

(2) This follows from Part 1 if we can establish that Pr (£ (7, a) = 0) = 0. Now observe that 
f)-i(($i/2 + £>i/2) G ) = n-i(£)i/2(3) by equivariance and that ($ x / 2 + D 1 / 2 ) G G R n \N* if and 

only if D 1 ' 2 G G R"\7V*. Hence 



Pr (£ (7, ct) = 0) = Pr f £ (7, a) = 0, ($ 1/2 + D 1 / 2 j G G R n \N* 
= Pr('(^( ( 7- 1 Z7 + $ 1 / 2 G))'i?'f2- 1 ( J D 1 / 2 G) 

x R (j3 (cr^Zj + $ 1/2 G)) = 0, D 1/2 G G M"\iV*) 
= fvT({p{a- 1 Z 1 + x)^R'(l- 1 {D 1 ' 2 G) 

x i?^(o-- 1 Z7 + a;)) =0,D 1/2 Gel"\r)d%(3:) 
= / Pr (($ {o- 1 Z 1 + x))' R'n- 1 ((V /2 + # 1/2 ) G) 

xflf^ (cx -1 ^ + x)\ = 0, ($ 1/2 + L> 1/2 ) G G M"\iV*) dP ,*(^) 

Po^+d {{y G R n \iV* : vixyn^iyWx) = 0}) dP 0) *(a;) (47) 



(iS 



with v(x) — R$ (a x Z~j + x), the third equality in the preceding display being true since ^' 2 G 
and D 1 / 2 G are independent as 

E { <P^ 2 G (d^g)' ) = $V2^1/2 = 



Now the integrand in the last line of (|47)) is zero by Assumption (0 for every x except when 
v(x) = 0. Hence, we are done if we can establish that Po,* (v(x) = 0) = 0. Because span($) equals 
the span of the columns of Z, we can make the change of variables x = Zc and obtain 

P ,* (v(x) = 0) = Po.a (v{Zc) = 0) = Po.a (R (p {Z (a- 1 ^ + c))) = o) 

where A = (Z'Z)~ Z'&Z (Z'Z)~ . Because A is non-singular, this probability is zero if the event 
has Adi -measure zero. But 



A K , ({c : Rp (Z (a- 1 "/ + c)) = o}) = A span($) ({* : RJ3 {z) = o}) = 

by our assumptions. 

Proof of Theorem 15. 19t Fix /i G 9Jlo and a, < a < oo. Then for every 7 e t ! we have 



D 



P Mo+ z 7 , CT ^ m (W(C)) = Pr (a m [T ( Mo + ^7 + ^{ 2 g) - C 



>0 



which converges to Pr (£ (7, cr) > 0) as shown in the preceding lemma (with S m and £ playing the 
roles of $ m and $, respectively). Consequently, for every 7 e M' 



But now 



inf. P^+z^s (W(CQ) < Pr (£ (7, o) > 0) 



lim inf inf Pr (£ (7, a) > 0) < lim inf inf Pr (£ (7, a) > 0) 

M->-oo || 7 ||>M M-yoo Rp(Z~/)^0,\\~f\\>M 



lim inf inf 



Pr(e(7^)/ll7l| 2 >0) 
e(7^)/ll7l| 2 >0) 



< lim inf inf Pr 

M-s-oo fl^(Z7)#0,||7|| = A/ 

< inf lim inf Pr (l (c, M, a) > 0) 

\\c\\ = l,Rp(Zc)^0 A/^oo 



where 



!(C Af.tr) 



R (p {Zc) + o£ (t l ' 2 G) /A/))' JT 1 (jt 1 ' 2 + D 1 ' 2 ) G 
P {Zc) + aP (E^g) /Al) 



on the event where (S 1 / 2 + D 1 / 2 ) G € M n \iV* and is zero else. The random variable £ (c, M, cr) 
converges in probability to the random variable £ (c) as M — > 00. Hence 

lim inf Pr (f (c, M, cr) > 0) = Pr (f (c) > 0) 
holds for every c € R' satisfying ||c|| = 1 and R$ (Zc) ^ 0, because Pr (£ (c) = 0) = for such c in
view of Assumption [7] observing that P f^+u is equivalent to Au» as £ + I? is nonsingular. This 
proves that 

liminf inf inf P u , z CT 2 E (W(C)) < inf Pr (l (c) > 0) 

= inf Pr (| (c) > 0) = inf Pr (£ (c) > 0) = K u 

||c||=i ceR' 

the first two equalities holding because £ (c) = if i?,/3 (Zc) = (and in particular if c = 0) and 
because Pr (£ (c) > 0) is homogenous in c. This establishes the first inequality in ([30]) because the 
left-most expression in (|30|) is monotonically increasing in M. Furthermore, 

su P P Mo , ct2s (W(C)) > P Mo , CT2Em (W(C)) , 
see 

and hence we obtain from Lemma IF. 21 that 

sup P^s (W(C)) > Pr (£ (0, a) > 0) 
sec 



Pr f (up (t l ' 2 G)) jr 1 ((s 1 / 2 + £>V2) G ) ^ (eV2 G ) > 0; 

f E 1/2 + P> 1/2 ) G e E"\7V* 



Now observe that fT 1 ((E 1 / 2 + D 1 / 2 ) G) = Q- 1 (D 1 / 2 G) by equivariance and that (S 1 / 2 + D 1 / 2 ) G G 
R"\7V* if and only if D 1 / 2 G e M n \iV*. Then by the same arguments as in (gTJ) we obtain 



Pr (£ (0, a) > 0) = / Pr f (#£ (a;) J ft" 1 ( (E 1/2 + £ 1/2 J Gj 

xR((3 (x)) > 0, (t 1 ' 2 + D 1 / 2 ) G e R n \iV*) dPo.sW 
Pr (I (7) > 0) dPo,A(7) = K 2 , 



the last equality resulting from the variable change x = Z7 which is possible since span(S) equals 
the space spanned by Z . Finally, the inequality K\ < Ki is obvious from the definition of these 
constants. ■ 

Proof of Theorem 5.21: Define $\varphi = 1(W(C))$ and note that invariance of $\varphi$ under $G(\mathfrak{M}_0)$ as well as the fact that $\varphi$ is $\lambda_{\mathbb{R}^n}$-almost surely neither equal to 0 nor 1 follows from Lemma 5.15. Part 1 of Theorem 5.10 then implies Part 1 of the theorem. Similarly, Parts 2 and 3 of the theorem follow from Parts 2 and 3 of Theorem 5.10, respectively, because condition (19) follows from Part 7 of Lemma 5.15 combined with Remark 5.16 and because the lower bound in (27) equals 1 under the assumptions of Part 3. To prove Part 4 we use Theorem 5.12. Choose a sequence $C_k$, $0 < C_k < \infty$, that diverges monotonically to infinity and set $\varphi_k = 1(W(C_k))$. Then (23) is satisfied by the monotone convergence theorem and the result follows from Theorem 5.12 upon setting $C(\delta) = C_{k_0(\delta)}$. ■



Lemma F.3. Let j3 and £1 satisfy Assumptions^ and^ Let T be the test statistic defined in i25\) 
and let £ be a covariance model. If there is a z £ span(J(£)) n 9JI with z ^ DJIq — Mo (i-c, with 
R/3(z) 7^ 0/, then T does not satisfy the invariance condition i31]) . 
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Proof. Choose z £ span (</(£)) D 971 with z ^ 97to — Mo- Because 97t is a linear space, we also have 
cz £ span ( J(£)) n97t for every ceM. Now cz £ 971 entails that j/ e M"\A* implies y + cz e W l \N*. 
Using the definition of T and Assumption [5] we obtain 

T{y + cz)=T (y) + 2c {R[3 (y) - r)' Cl' 1 (y) R(i(z) + c 2 (Rfi(z))' £r l (y) (rP(z) 

for every y £ M. n \N*. Because R$(z) ^ 0, we can in view of Assumption [7] find an y £ W l \N* such 
that 

holds. Hence $T(y + cz) = T(y)$ cannot hold for the so-chosen $y$ and all $c \neq 0$. Because $cz \in \operatorname{span}(J(\mathfrak{C}))$, Remark 5.11(i) implies that condition (31) is not satisfied. □
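The quadratic expansion of $T(y + cz)$ in $c$ used in this proof can be checked numerically in a toy case. The sketch below is only an illustration under assumptions that are not part of the paper: the estimator is ordinary least squares on a small orthogonal design, the variance estimator is the classical homoskedastic one, and $R = (0, 1)$, $r = 0$. All function and variable names are hypothetical.

```python
def ols_orthogonal(y, columns):
    """OLS coefficients for mutually orthogonal regressor columns."""
    return [sum(x * v for x, v in zip(col, y)) / sum(x * x for x in col)
            for col in columns]


def deviation_and_omega(y, columns, r0=0.0):
    """Return (R beta(y) - r, Omega(y)) for R = (0, 1).

    Omega(y) is the classical estimator sigma^2(y) * R (X'X)^{-1} R',
    an assumption made only for this illustration.
    """
    n, k = len(y), len(columns)
    beta = ols_orthogonal(y, columns)
    fitted = [sum(b * col[t] for b, col in zip(beta, columns)) for t in range(n)]
    sigma2 = sum((v - f) ** 2 for v, f in zip(y, fitted)) / (n - k)
    omega = sigma2 / sum(x * x for x in columns[1])
    return beta[1] - r0, omega


def wald(y, columns, r0=0.0):
    """Scalar Wald-type statistic (R beta - r)' Omega^{-1} (R beta - r)."""
    dev, omega = deviation_and_omega(y, columns, r0)
    return dev * dev / omega


# Toy design: intercept and an alternating regressor (orthogonal columns).
columns = [[1.0, 1.0, 1.0, 1.0], [1.0, -1.0, 1.0, -1.0]]
y = [1.0, 2.5, -0.5, 3.0]
z = columns[1]          # z lies in span(X) and R beta(z) = 1 != 0
c = 0.7

r_beta_z = ols_orthogonal(z, columns)[1]
dev, omega = deviation_and_omega(y, columns)
shifted = [v + c * w for v, w in zip(y, z)]

lhs = wald(shifted, columns)
rhs = (wald(y, columns)
       + 2 * c * dev * r_beta_z / omega
       + c ** 2 * r_beta_z ** 2 / omega)
print(lhs, rhs)
```

In this example the two printed numbers agree and differ from $T(y)$, which is the mechanism the proof exploits: a shift along a direction $z$ with nonzero restricted coefficient changes the statistic.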

Proof of Proposition 5.23: (1) By the assumed equivariance (invariance, respectively) of $\check\theta$, $\check\Omega$, and $N$ (and hence of $N^*$) w.r.t. the transformations $y \mapsto \alpha y + \bar X\eta$, the equivariance (invariance, respectively) of $\check\beta$, $\check\Omega$, and $N$ required in the original Assumption 5 is clearly satisfied. Now choose $z \in J(\mathfrak{C})$ and $y \in \mathbb{R}^n$. If $y \in N^*$ then so is $y + z$ because of invariance of $N^*$ and because $z \in J(\mathfrak{C}) \subseteq \bar{\mathfrak{M}}$ holds by construction. Hence, $T(y) = 0 = T(y + z)$ is satisfied in this case. Now let $y \in \mathbb{R}^n \setminus N^*$ (and hence also $y + z \in \mathbb{R}^n \setminus N^*$). Note that $\check\Omega(y) = \check\Omega(y + z)$ holds by equivariance. It remains to show that $R\check\beta(y) = R\check\beta(y + z)$. Because $z \in J(\mathfrak{C}) \subseteq \bar{\mathfrak{M}}$ we have $z = X\gamma + (\bar x_1, \ldots, \bar x_p)\delta$ and thus obtain

$$R\check\beta(y + z) = (R, 0)\,\check\theta(y + z) = (R, 0)\left(\check\theta(y) + (\gamma', \delta')'\right) = R\check\beta(y) + R\gamma, \qquad (48)$$

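where we have made use of equivariance of $\check\theta$. Now observe that $(\bar x_1, \ldots, \bar x_p)\delta \in \operatorname{span}(J(\mathfrak{C}) \cup (\mathfrak{M}_0 - \mu_0))$ by construction of the $\bar x_i$. Hence, we can find an element $\mu_0^* \in \mathfrak{M}_0$ such that $(\bar x_1, \ldots, \bar x_p)\delta - (\mu_0^* - \mu_0) \in \operatorname{span}(J(\mathfrak{C}))$. Consequently, we obtain

$$z - \left((\bar x_1, \ldots, \bar x_p)\delta - (\mu_0^* - \mu_0)\right) = X\gamma + (\mu_0^* - \mu_0).$$

The left-hand side is obviously an element of $\operatorname{span}(J(\mathfrak{C}))$, while the right-hand side belongs to $\mathfrak{M}$, implying that the right-hand side is in $\operatorname{span}(J(\mathfrak{C})) \cap \mathfrak{M}$, which is a subset of $\mathfrak{M}_0 - \mu_0$ by assumption. Because $\mu_0^* - \mu_0 \in \mathfrak{M}_0 - \mu_0$, we have established that $X\gamma \in \mathfrak{M}_0 - \mu_0$, or in other words, that $R\gamma = 0$.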

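(2) The very first claim is obvious. If $z \in \operatorname{span}(J(\mathfrak{C}))$ then again we have $z = X\gamma + (\bar x_1, \ldots, \bar x_p)\delta$ and $\check\theta(z) = (\gamma', \delta')'$. Now $R\check\beta(z) = (R, 0)\,\check\theta(z) = R\gamma$ and exactly the same argument as above shows that $R\gamma = 0$. For the last claim note that $\bar X\check\theta(y) = \bar X^*\check\theta^*(y)$ holds because $\bar X$ and $\bar X^*$ span the same space. This equality can be written as

$$X\check\beta(y) - X\check\beta^*(y) = \sum_{i=1}^{p} \bar x_i^*\,\check\theta^*_{k+i}(y) - \sum_{i=1}^{p} \bar x_i\,\check\theta_{k+i}(y).$$

Because the right-hand side of the above equation belongs to $\operatorname{span}(J(\mathfrak{C}) \cup (\mathfrak{M}_0 - \mu_0))$ we can find $\mu_0^* \in \mathfrak{M}_0$ such that the right-hand side of

$$X\left(\check\beta(y) - \check\beta^*(y)\right) - (\mu_0^* - \mu_0) = \sum_{i=1}^{p} \bar x_i^*\,\check\theta^*_{k+i}(y) - \sum_{i=1}^{p} \bar x_i\,\check\theta_{k+i}(y) - (\mu_0^* - \mu_0)$$

belongs to $\operatorname{span}(J(\mathfrak{C}))$ while the left-hand side belongs to $\mathfrak{M}$. Arguing now similarly as in the proof of Part 1, we conclude that $R\check\beta(y) = R\check\beta^*(y)$. ■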

G Appendix: Properties of AR-Correlation Matrices 

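Lemma G.1. 1. Suppose the covariance model $\mathfrak{C}$ contains $\Lambda(\rho_m)$ for some sequence $\rho_m \in (-1, 1)$ with $\rho_m \to 1$ ($\rho_m \to -1$, respectively). Then $\operatorname{span}(e_+)$ ($\operatorname{span}(e_-)$, respectively) is a concentration space of $\mathfrak{C}$.

2. $\mathfrak{C}_{AR(1)}$ has $\operatorname{span}(e_+)$ and $\operatorname{span}(e_-)$ as its only concentration spaces. Consequently, $J(\mathfrak{C}_{AR(1)}) = \operatorname{span}(e_+) \cup \operatorname{span}(e_-)$.

3. If $\rho_m \in (-1, 1)$ is a sequence converging to $1$ then $\Sigma_m = \Lambda(\rho_m)$ satisfies $\Sigma_m \to \bar\Sigma = e_+e_+'$ and $D_m = \Pi_{\operatorname{span}(\bar\Sigma)^\perp}\Sigma_m\Pi_{\operatorname{span}(\bar\Sigma)^\perp}/s_m \to D$ as well as $\Pi_{\operatorname{span}(\bar\Sigma)^\perp}\Sigma_m\Pi_{\operatorname{span}(\bar\Sigma)}/s_m^{1/2} \to 0$, where $s_m = \operatorname{tr}\left(\Pi_{\operatorname{span}(\bar\Sigma)^\perp}\Sigma_m\Pi_{\operatorname{span}(\bar\Sigma)^\perp}\right)$ converges to zero and $D$ is the matrix with $(i,j)$-th element $-n\,|i - j| \big/ \sum_{i,j}|i - j|$ pre- and postmultiplied by $(I_n - n^{-1}e_+e_+')$. Furthermore, $D$ is regular on $\operatorname{span}(\bar\Sigma)^\perp$.

4. If $\rho_m \in (-1, 1)$ is a sequence converging to $-1$ then $\Sigma_m = \Lambda(\rho_m)$ satisfies $\Sigma_m \to \bar\Sigma = e_-e_-'$ and $D_m = \Pi_{\operatorname{span}(\bar\Sigma)^\perp}\Sigma_m\Pi_{\operatorname{span}(\bar\Sigma)^\perp}/s_m \to D$ as well as $\Pi_{\operatorname{span}(\bar\Sigma)^\perp}\Sigma_m\Pi_{\operatorname{span}(\bar\Sigma)}/s_m^{1/2} \to 0$, where $s_m = \operatorname{tr}\left(\Pi_{\operatorname{span}(\bar\Sigma)^\perp}\Sigma_m\Pi_{\operatorname{span}(\bar\Sigma)^\perp}\right)$ converges to zero and $D$ is the matrix with $(i,j)$-th element $n\,(-1)^{|i-j|+1}|i - j| \big/ \sum_{i,j}|i - j|$ pre- and postmultiplied by $(I_n - n^{-1}e_-e_-')$. Furthermore, $D$ is regular on $\operatorname{span}(\bar\Sigma)^\perp$.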
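Part 3 can be illustrated numerically. The following sketch assumes that $\Lambda(\rho)$ is the $n \times n$ matrix with $(i,j)$-th entry $\rho^{|i-j|}$ and that $e_+ = (1, \ldots, 1)'$, so that the projection onto $\operatorname{span}(e_+)^\perp$ amounts to double centering. It evaluates $D_m$ for $\rho$ close to $1$ and compares it with the limit $D$ described in the lemma. The function names are hypothetical and only the standard library is used.

```python
def ar1_corr(rho, n):
    """AR(1) correlation matrix Lambda(rho) with (i, j) entry rho**|i - j|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


def double_center(a):
    """Return P a P with P = I_n - e_+ e_+'/n (projection onto span(e_+)^perp)."""
    n = len(a)
    row = [sum(r) / n for r in a]
    col = [sum(a[i][j] for i in range(n)) / n for j in range(n)]
    tot = sum(row) / n
    return [[a[i][j] - row[i] - col[j] + tot for j in range(n)] for i in range(n)]


def normalized_projection(rho, n):
    """Return (D_m, s_m) with s_m = tr(P Lambda(rho) P) and D_m = P Lambda(rho) P / s_m."""
    pcp = double_center(ar1_corr(rho, n))
    s = sum(pcp[i][i] for i in range(n))
    return [[x / s for x in r] for r in pcp], s


def limit_matrix(n):
    """Limit D: entries -n|i - j| / sum_{i,j}|i - j|, then double centered."""
    total = sum(abs(i - j) for i in range(n) for j in range(n))
    raw = [[-n * abs(i - j) / total for j in range(n)] for i in range(n)]
    return double_center(raw)


def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


n = 5
d_m, s_m = normalized_projection(1 - 1e-6, n)
d = limit_matrix(n)
gap = max_abs_diff(d_m, d)
trace_d = sum(d[i][i] for i in range(n))
print(s_m, gap, trace_d)
```

The trace of the limit equals one by construction, since $D_m$ is normalized by its own trace. The printed gap shrinks as $\rho$ approaches $1$, in line with the convergence claimed in Part 3.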

Proof. (1) and (2) are obvious. 

(3) Because $\Pi_{\operatorname{span}(\bar\Sigma)^\perp}\Sigma_m\Pi_{\operatorname{span}(\bar\Sigma)^\perp}$ is nonnegative definite, but obviously different from the zero matrix (recall that $n > 1$ is assumed), we see that $s_m$ is always positive. Clearly, $\Pi_{\operatorname{span}(\bar\Sigma)^\perp}\Sigma_m\Pi_{\operatorname{span}(\bar\Sigma)^\perp}$ converges to $\Pi_{\operatorname{span}(\bar\Sigma)^\perp}\bar\Sigma\,\Pi_{\operatorname{span}(\bar\Sigma)^\perp} = 0$ and hence $s_m \to 0$. By l'Hôpital's rule the limit of $D_m$ can be obtained as the limit of $\Pi_{\operatorname{span}(\bar\Sigma)^\perp}(d\Lambda/d\rho)(\rho_m)\Pi_{\operatorname{span}(\bar\Sigma)^\perp}$ divided by the limit of

$$\operatorname{tr}\left(\Pi_{\operatorname{span}(\bar\Sigma)^\perp}(d\Lambda/d\rho)(\rho_m)\Pi_{\operatorname{span}(\bar\Sigma)^\perp}\right),$$

provided the latter is nonzero. The second limit now equals

$$\operatorname{tr}\left((I_n - n^{-1}e_+e_+')(d\Lambda/d\rho)(1)(I_n - n^{-1}e_+e_+')\right) = \operatorname{tr}\left((d\Lambda/d\rho)(1)(I_n - n^{-1}e_+e_+')\right) = \operatorname{tr}\left((d\Lambda/d\rho)(1)\right) - n^{-1}\operatorname{tr}\left(e_+'(d\Lambda/d\rho)(1)e_+\right).$$

Observe that the $(i,j)$-th element of the matrix $(d\Lambda/d\rho)(1)$ is given by $|i - j|$. Hence, the above expression equals

$$-n^{-1}\operatorname{tr}\left(e_+'(d\Lambda/d\rho)(1)e_+\right) = -n^{-1}\sum_{i,j}|i - j|,$$

which is clearly nonzero. The first limit exists and equals

$$(I_n - n^{-1}e_+e_+')(d\Lambda/d\rho)(1)(I_n - n^{-1}e_+e_+'),$$

which shows that $D$ is of the form as claimed in the lemma. We next show that $D$ is regular on $\operatorname{span}(\bar\Sigma)^\perp = \operatorname{span}(e_+)^\perp$. This is equivalent to showing that the equation system

$$(d\Lambda/d\rho)(1)\,x + \lambda e_+ = 0, \qquad e_+'x = 0$$

has $x = 0$, $\lambda = 0$ as its only solution. We hence need to show that the $(n+1) \times (n+1)$ matrix

$$A = \begin{pmatrix} (d\Lambda/d\rho)(1) & e_+ \\ e_+' & 0 \end{pmatrix}$$

has rank $n + 1$. Let $B$ be the $(n+1) \times (n+1)$ matrix given by

$$B = \begin{pmatrix} B_{11} & 0 \\ 0 & 1 \end{pmatrix},$$

where the $n \times n$ matrix $B_{11}$ has $1$ everywhere on the main diagonal, $-1$ everywhere on the first off-diagonal above the main diagonal, and zeroes elsewhere. Let the $(n+1) \times (n+1)$ matrices $B^*$ and $B^{**}$ be given by

$$B^* = \begin{pmatrix} 0 & 1 \\ I_n & 0 \end{pmatrix}, \qquad B^{**} = \begin{pmatrix} I_n & 0 \\ f' & 1 \end{pmatrix},$$

where $f' = -(n-1, n-2, n-3, \ldots, 1, 0)$. Observe that $B$, $B^*$, as well as $B^{**}$ are nonsingular and that

$$B^*BAB^{**} = C = \begin{pmatrix} C_{11} & 0 \\ 0 & 1 \end{pmatrix},$$

where $C_{11}$ is an $n \times n$ matrix that has $1$ everywhere on and above the diagonal and $-1$ everywhere below the diagonal. Obviously, $C$ is nonsingular and hence $A$ is so. Finally, we show that the limit of $\Pi_{\operatorname{span}(\bar\Sigma)^\perp}\Sigma_m\Pi_{\operatorname{span}(\bar\Sigma)}/s_m^{1/2}$ equals zero. Because $s_m \to 0$, it suffices to show that the limit of $\Pi_{\operatorname{span}(\bar\Sigma)^\perp}\Sigma_m\Pi_{\operatorname{span}(\bar\Sigma)}/s_m$ exists and is finite. Now the same arguments as above show that the latter limit is equal to $(I_n - n^{-1}e_+e_+')(d\Lambda/d\rho)(1)\,n^{-1}e_+e_+'$ divided by $-n^{-1}\sum_{i,j}|i - j|$.

(4) For the same reasons as in (3), $s_m$ is positive and converges to zero. By the same argument as in (3) the limit of $D_m$ is

$$\frac{(I_n - n^{-1}e_-e_-')(d\Lambda/d\rho)(-1)(I_n - n^{-1}e_-e_-')}{\operatorname{tr}\left((I_n - n^{-1}e_-e_-')(d\Lambda/d\rho)(-1)(I_n - n^{-1}e_-e_-')\right)}.$$

Note that the denominator is equal to

$$\operatorname{tr}\left((d\Lambda/d\rho)(-1)\right) - n^{-1}\operatorname{tr}\left(e_-'(d\Lambda/d\rho)(-1)e_-\right) = n^{-1}\sum_{i,j}|i - j| \neq 0,$$

observing that the $(i,j)$-th element of $(d\Lambda/d\rho)(-1)$ is given by $(-1)^{|i-j|+1}|i - j|$. We next show that $D$ is regular on $\operatorname{span}(\bar\Sigma)^\perp = \operatorname{span}(e_-)^\perp$. This is equivalent to showing that the equation system

$$(d\Lambda/d\rho)(-1)\,x + \lambda e_- = 0, \qquad e_-'x = 0$$

has $x = 0$, $\lambda = 0$ as its only solution. We hence need to show that the $(n+1) \times (n+1)$ matrix

$$A^* = \begin{pmatrix} (d\Lambda/d\rho)(-1) & e_- \\ e_-' & 0 \end{pmatrix}$$

has rank $n + 1$. Note that this is equivalent to establishing that the matrix

$$A^\dagger = \begin{pmatrix} (d\Lambda/d\rho)(-1) & (-1)^{n+1}e_- \\ (-1)^{n+1}e_-' & 0 \end{pmatrix}$$

is nonsingular. Now note that

$$A^\dagger = -EAE,$$

where $A$ is as in (3) and $E$ is an $(n+1) \times (n+1)$ diagonal matrix with the $i$-th diagonal element given by $(-1)^i$. This proves regularity of $D$ on $\operatorname{span}(\bar\Sigma)^\perp$. The claim for $\Pi_{\operatorname{span}(\bar\Sigma)^\perp}\Sigma_m\Pi_{\operatorname{span}(\bar\Sigma)}/s_m^{1/2}$ is proved as in (3). □
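The matrix identity $B^*BAB^{**} = C$ from Part (3) and the resulting nonsingularity can be verified exactly for small $n$. The sketch below builds the matrices with integer entries and computes determinants by exact rational elimination. It also checks that the bordered matrix $A^*$ of Part (4) is nonsingular, assuming the convention $e_- = (-1, 1, \ldots, (-1)^n)'$; that convention and all function names are assumptions of this illustration.

```python
from fractions import Fraction


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def det(m):
    """Determinant by exact Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in r] for r in m]
    size, sign, result = len(m), 1, Fraction(1)
    for c in range(size):
        pivot = next((r for r in range(c, size) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            sign = -sign
        result *= m[c][c]
        for r in range(c + 1, size):
            factor = m[r][c] / m[c][c]
            m[r] = [x - factor * y for x, y in zip(m[r], m[c])]
    return sign * result


def part3_matrices(n):
    """Build A, B, B*, B** and C as described in Part (3) of the proof."""
    size = n + 1
    a = [[abs(i - j) for j in range(n)] + [1] for i in range(n)] + [[1] * n + [0]]
    b = [[0] * size for _ in range(size)]
    for i in range(n):
        b[i][i] = 1
        if i + 1 < n:
            b[i][i + 1] = -1          # first superdiagonal of B_11
    b[n][n] = 1
    b_star = [[0] * size for _ in range(size)]
    b_star[0][n] = 1                  # first block row (0, 1)
    for i in range(n):
        b_star[i + 1][i] = 1          # second block row (I_n, 0)
    b_2star = [[1 if i == j else 0 for j in range(size)] for i in range(size)]
    for j in range(n):
        b_2star[n][j] = -(n - 1 - j)  # last row (f', 1)
    c = [[1 if j >= i else -1 for j in range(n)] + [0] for i in range(n)]
    c.append([0] * n + [1])
    return a, b, b_star, b_2star, c


def a_star(n):
    """Bordered matrix of Part (4), with e_- = (-1, 1, ..., (-1)^n)' assumed."""
    e_minus = [(-1) ** (i + 1) for i in range(n)]
    d = [[(-1) ** (abs(i - j) + 1) * abs(i - j) for j in range(n)] for i in range(n)]
    return [d[i] + [e_minus[i]] for i in range(n)] + [e_minus + [0]]


checks = {}
for n in range(2, 7):
    a, b, b_star, b_2star, c = part3_matrices(n)
    product = matmul(matmul(matmul(b_star, b), a), b_2star)
    checks[n] = (product == c, det(a), det(c), det(a_star(n)))
    print(n, checks[n])
```

Adding the first row of $C_{11}$ to every later row gives an upper triangular matrix with diagonal $(1, 2, \ldots, 2)$, so $\det C = 2^{n-1}$, which is what the exact computation returns for the sizes tried here.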

Lemma G.2. For every $\nu \in [0, \pi]$ there exists a sequence $\Sigma_m \in \mathfrak{C}_{AR(2)}$ converging to $E(\nu)E(\nu)'$.

Proof. For $\nu = 0$ ($\nu = \pi$, respectively) the matrix $E(\nu)E(\nu)'$ equals $e_+e_+'$ ($e_-e_-'$, respectively), and the result thus follows from Lemma G.1. Hence assume that $\nu \in (0, \pi)$. Consider for $0 < r < 1$ the AR(2)-spectral density

$$f_r(\omega) = (2\pi)^{-1}c(r)\left|1 - 2r\cos(\nu)\exp(-\iota\omega) + r^2\exp(-2\iota\omega)\right|^{-2},$$

where

$$c(r) = (1 - r^2)\left((1 + r^2)^2 - 4r^2\cos^2(\nu)\right)(1 + r^2)^{-1}.$$

Observe that $\int f_r(\omega)\,d\omega = 1$ where the integral extends over $[-\pi, \pi]$. Hence the $n \times n$ variance covariance matrix $\Sigma(r)$ corresponding to $f_r$ belongs to $\mathfrak{C}_{AR(2)}$. Let $\varepsilon > 0$ be given and set $A(\varepsilon) = \{\omega \in [-\pi, \pi] : |\omega - \nu| > \varepsilon\} \cap \{\omega \in [-\pi, \pi] : |\omega + \nu| > \varepsilon\}$. Then it is easy to see that

$$\sup_{\omega \in A(\varepsilon)}|f_r(\omega)| \to 0 \quad \text{for } r \to 1.$$

Consequently, for every $\delta > 0$ and every $\varepsilon > 0$ there exists an $r(\varepsilon, \delta)$ with $0 < r(\varepsilon, \delta) < 1$ such that

$$\int_{[-\pi,\pi] \setminus A(\varepsilon)} f_r(\omega)\,d\omega > 1 - \delta$$

holds for all $r$ satisfying $r(\varepsilon, \delta) < r < 1$. In view of symmetry of $f_r$ around $\omega = 0$, this shows that for $r$ sufficiently close to $1$ the spectral density $f_r$ is arbitrarily small outside of the union of the neighborhoods $|\omega - \nu| < \varepsilon$ and $|\omega + \nu| < \varepsilon$ and puts mass arbitrarily close to $1/2$ on each one of the two neighborhoods. A standard argument then shows for every continuous function $g$ on $[-\pi, \pi]$ that

$$\int_{[-\pi,\pi]} g(\omega)f_r(\omega)\,d\omega \to 0.5\,g(\nu) + 0.5\,g(-\nu) = \int_{[-\pi,\pi]} g(\omega)\,d\left(0.5\,\delta_\nu + 0.5\,\delta_{-\nu}\right)(\omega),$$

where $\delta_x$ denotes unit pointmass at $x$. Specializing to $g(\omega) = \exp(-\iota l\omega)$ shows that $\Sigma(r)$ converges to $E(\nu)E(\nu)'$. □
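The convergence in Lemma G.2 can be illustrated numerically. The sketch below assumes that $E(\nu)$ has rows $(\cos(t\nu), \sin(t\nu))$, $t = 1, \ldots, n$, so that the $(i,j)$-th element of $E(\nu)E(\nu)'$ is $\cos((i - j)\nu)$; this definition is not restated in this appendix and is an assumption here. The autocorrelations of the AR(2) process with coefficients $\phi_1 = 2r\cos(\nu)$ and $\phi_2 = -r^2$ are obtained from the Yule-Walker recursion, and the normalization $\int f_r = 1$ is checked by a Riemann sum. Function names are hypothetical.

```python
import cmath
import math


def c_of_r(r, nu):
    """Innovation variance c(r) that gives the AR(2) process unit variance."""
    return (1 - r ** 2) * ((1 + r ** 2) ** 2 - 4 * r ** 2 * math.cos(nu) ** 2) / (1 + r ** 2)


def spectral_density(omega, r, nu):
    """AR(2) spectral density f_r(omega) from the proof of Lemma G.2."""
    z = (1 - 2 * r * math.cos(nu) * cmath.exp(-1j * omega)
         + r ** 2 * cmath.exp(-2j * omega))
    return c_of_r(r, nu) / (2 * math.pi * abs(z) ** 2)


def total_mass(r, nu, grid=4096):
    """Riemann sum of f_r over one period [-pi, pi)."""
    step = 2 * math.pi / grid
    return step * sum(spectral_density(-math.pi + k * step, r, nu) for k in range(grid))


def ar2_autocorr(r, nu, max_lag):
    """Autocorrelations via Yule-Walker with phi1 = 2 r cos(nu), phi2 = -r^2."""
    phi1, phi2 = 2 * r * math.cos(nu), -r ** 2
    rho = [1.0, phi1 / (1 - phi2)]
    for _ in range(2, max_lag + 1):
        rho.append(phi1 * rho[-1] + phi2 * rho[-2])
    return rho[:max_lag + 1]


n, nu = 6, 1.0
mass = total_mass(0.8, nu)

rho = ar2_autocorr(1 - 1e-6, nu, n - 1)
sigma_r = [[rho[abs(i - j)] for j in range(n)] for i in range(n)]
target = [[math.cos((i - j) * nu) for j in range(n)] for i in range(n)]
gap = max(abs(sigma_r[i][j] - target[i][j]) for i in range(n) for j in range(n))
print(mass, gap)
```

Because $c(r)$ normalizes the variance to one, the autocovariances coincide with the autocorrelations, and the printed gap between $\Sigma(r)$ and the assumed limit shrinks as $r$ approaches $1$.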
Using the arguments in the above proof it is actually not difficult to show that the closure of the set of AR(2)-spectral densities in the weak topology is the class of AR(2)-spectral densities plus all spectral measures of the form $0.5\,\delta_\nu + 0.5\,\delta_{-\nu}$ for $\nu \in [0, \pi]$. This result extends in an obvious way to higher-order autoregressive models and has an appropriate generalization to (multivariate) autoregressive moving average models, see Theorem 4.1 in Deistler and Pötscher (1984).